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Abstract 

This  paper  reviews  the  ten  year  history  of  tutor  development  based  on  the  ACT  theory 
(Anderson,  1983,  1993).  We  developed  production  system  models  in  ACT  of  how 
students  solved  problems  in  LISP,  geometry,  and  algebra.  Computer  tutors  were 
developed  around  these  cognitive  models.  Construction  of  these  tutors  was  guided  by  a  set 
of  eight  principles  loosely  based  on  the  ACT  theory.  Early  evaluations  of  these  tutors 
usually  but  not  always  showed  significant  achievement  gains.  Best  case  evaluations 
showed  that  students  could  achieve  at  least  the  same  level  of  proficiency  as  conventional 
instruction  in  one-third  of  the  time.  Empirical  studies  showed  that  students  were  learning 
skills  in  production-rule  units  and  that  the  best  tutorial  interaction  style  was  one  in  which 
the  tutor  provides  immediate  feedback,  consisting  of  short  and  directed  error  messages. 
The  tutors  appear  to  work  better  if  they  present  themselves  to  students  as  non  human  tools 
to  assist  learning  rather  than  as  emulations  of  human  tutors.  Students  working  with  these 
tutors  display  transfer  to  other  environments  to  the  degree  that  they  can  map  the  tutor 
environment  into  the  test  environment.  These  experiences  have  coalesced  into  a  new 
system  for  developing  and  deploying  tutors.  This  system  involves  first  selecting  a 
problem-solving  interface,  then  constructing  a  curriculum  under  the  guidance  of  a  domain 
expert,  then  designing  a  cognitive  model  for  solving  problems  in  that  environment,  then 
building  instruction  around  the  productions  in  that  model,  and  finally  deploying  the  tutor  in 
the  classroom.  New  tutors  are  being  built  in  this  system  to  achieve  the  NCTM  standards 
for  high  school  mathematics  in  an  urban  setting. 
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Introduction 

Over  the  past  10  years,  our  research  group  (the  Advanced  Computer  Tutoring  Project  at 
Carnegie  Mellon  University)  has  been  developing  a  type  of  computer-based  instructional 
technology  which  we  call  cognitive  tutors.  The  core  commitment  at  every  stage  of  the 
work  and  in  all  applications  is  that  instruction  should  be  designed  with  reference  to  a 
cognitive  model  of  the  competence  that  the  student  is  being  asked  to  learn.  This  means  that 
the  system  possesses  a  computational  model  capable  of  solving  the  problems  that  are  given 
to  students  in  the  ways  students  are  expected  to  solve  the  problems.  As  will  be  elaborated, 
all  decisions  about  delivering  such  instruction  are  made  with  reference  to  that  model.  These 
systems  are  called  tutors  because  our  initial  work  on  them  was  inspired  by  the  intelligent 
tutoring  work  of  the  late  70s  and  early  80s  (e.g.,  Sleeman  &  Brown,  1982).  Indeed,  when 
we  embarked  on  the  project,  we  had  the  ill-defined  goal  that  our  systems  interact  with 
students  like  private  human  tutors.  While  we  de-emphasized  the  emulation  of  the  human 
tutors  over  the  years,  the  term  “tutor”  has  stuck. 

This  article  will  survey  our  work  on  tutoring.  It  will  describe  the  motivations  for  being 
involved  in  tutoring,  the  theoretical  assumptions  underpinning  the  work,  the  empirical 
evidence  for  the  claims,  and  the  current  directions  of  the  research.  This  overview  will  be 
organized  according  to  its  three  identifiable  stages:  a  flurry  of  tutor  building  in  the  mid- 
80s,  a  flurry  of  evaluations  in  the  late  80s,  and  a  current  effort  to  build  and  deploy  practical 
tutor  systems. 


Stage  1 :  Early  Tutor  Building 

1982  saw  the  completion  of  the  ACT*  theory  of  learning  and  problem  solving  which  was 
described  in  the  Architecture  of  Cognition  (Anderson,  1983).  Much  of  that  theory  was 
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concerned  with  the  acquisition  of  cognitive  skills  and  we  had  done  research  testing  the 
theory  in  the  domains  of  proof  generation  in  geometry  (Anderson,  Greeno,  Kline,  & 
Neves,  1981)  and  initial  programming  skills  in  LISP  (Anderson,  Farrell,  &  Sauers,  1984). 
The  theory  held  that  a  cognitive  skill  consists  in  large  part  of  units  of  goal-related 
knowledge.  Cognitive  skill  acquisition  involves  the  formulation  of  thousands  of  rules 
relating  task  goals  and  task  states  to  actions  and  consequences.  The  theory  employs  a 
production-rule  formalism  to  represent  this  goal-oriented  knowledge.  For  example,  a 
geometry  proof  generation  mle  might  be: 

IF  the  goal  is  to  prove  two  triangles  are  congruent 
THEN  set  as  a  subgoal  to  prove  that  corresponding  parts  are  congruent. 

while  a  LISP  programming  rule  would  be: 

IF  the  goal  is  to  get  the  second  element  of  the  list 
THEN  code  car  and  set  as  a  subgoal  to  pass  to  car  an  argument  which  is  the  tail  of 
the  list 

It  is  not  the  production  rule  notation  that  is  critical;  it  is  the  set  of  representational  features 
that  this  notation  enables.  For  instance,  production  rules  are  procedural,  abstract,  modular, 
directional,  and  goal  related.  See  Anderson  (1993,  Chapter  2  and  especially  Section  2.4) 
for  an  elaboration  of  these  features. 

A  theory  of  the  acquisition  of  cognitive  skills  should  have  implications  for  their  instruction. 
We  thought  it  would  be  an  important  test  of  the  theory  if  we  could  use  it  to  optimize 
learning.  However,  it  is  not  a  trivial  matter  to  convert  a  scientific  theory  of  a  phenomenon 
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to  an  engineering  theory  of  how  to  foster  that  phenomenon.  We  undertook  our  first  work 
in  intelligent  tutoring  to  explore  how  such  a  conversion  might  take  place. 


The  ACT  Theory 

Since  the  tutoring  effort  is  so  strongly  tied  to  the  ACT  theories  of  skill  acquisition  (initially, 
the  ACT*  theory — ^Anderson,  1983 — and  now  the  ACT-R  theory — ^Anderson,  1993)  it  is 
worth  reviewing  the  principal  tenets  of  that  theory: 

1.  Procedural'DecIarative  Distinction:  The  theory  distinguishes  between 
declarative  knowledge  (e.g.,  knowing  the  side-angle-side  theorem)  and  procedural 
knowledge  (e.g.,  an  ability  to  use  the  side-angle-side  theorem  in  a  proof).  The 
assumption  of  the  theory  is  that  goal-independent  declarative  knowledge  initially 
enters  the  system  in  a  form  that  can  be  encoded  more  or  less  directly  from 
observation  and  instruction.  Cognitive  skill  depends  on  converting  this  knowledge 
into  production  rules  like  the  above  which  represent  the  procedural  knowledge. 

2.  Knowledge  Compilation:  It  is  assumed  that  the  students  could  use  various 
interpretative  procedures  such  as  instruction-following  and  analogy  to  generate 
problem-solving  behavior  by  relating  declarative  knowledge  to  task  goals.  A 
learning  process  called  knowledge  compilation  converts  this  interpretive  problem 
solving  into  production  rules.  Thus,  the  theory  assumes  that  production  rules  can 
only  be  learned  by  employing  declarative  knowledge  in  the  context  of  a  problem¬ 
solving  activity. 

3.  Strengthening:  It  is  assumed  that  both  declarative  and  procedural  knowledge 
acquire  strength  with  practice.  Application  of  weak  knowledge  can  result  in  slips 
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and  errors.  Thus,  even  after  the  knowledge  has  been  successfully  encoded  further 
practice  produces  smoother,  more  rapid,  and  less  errorful  execution. 

These  three  assumptions^  pointed  us  in  the  direction  of  a  method  of  instruction  in  which 
students  were  presented  with  an  initial  brief  declarative  instruction  and  then  they  received 
good  deal  of  guided  practice.  As  stressed  elsewhere  (Anderson,  Conrad,  &  Corbett,  1989; 
Anderson  1993)  this  conception  of  skill  acquisition  is  quite  simple.  The  apparent 
complexity  of  learning  a  cognitive  skill  results  froifi  the  inherent  complexity  of  the  domain 
being  learned.  That  complexity  is  reflected  in  the  complexity  of  the  rule  set  that  has  to  be 
learned  but  the  learning  of  each  production  rule  is  quite  simple. 


The  Nature  of  a  Cognitive  Skill 

We  have  already  used  the  term  “cognitive  skill”  and  will  continue  throughout  this  paper  to 
use  the  term  to  describe  what  our  tutors  teach.  Therefore,  it  seems  important  to  be  clear  on 
how  the  term  relates  to  what  one  might  conceive  of  as  “competence”  in  a  particular  domain 
such  as  geometry.  We  use  “cognitive  skill”  to  refer  to  the  set  of  production  rules  acquired 
in  the  domain.  According  to  the  ACT  theory,  there  is  more  to  domain  competence  than  just 
these  production  rules.  There  is  also  the  declarative  structures  that  represent  domain 
knowledge.  While,  in  principle,  it  is  possible  to  have  of  all  domain  knowledge  represented 
in  production  rules  or  all  domain  knowledge  represented  declaratively  (and  being 
interpreted  by  domain-independent  procedures)  we  do  not  think  either  is  the  profitable  way 
to  develop  domain  competence.  If  everything  had  to  be  represented  as  production  rules, 
too  many  rules  would  be  required  because  it  would  be  necessary  to  represent  each  piece  of 
knowledge  in  each  way  it  could  be  used.  If  everything  had  to  be  interpreted  from 


^There  were  other  ideas  in  the  ACT*  thewy  about  production  rule  learning  but  these  were  abandoned  in  the 
ACT-R  theory  which  is  in  many  ways  a  simplification  of  the  earlier  theory. 
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declarative  representations  it  would  be  too  inefficient  and  place  too  great  a  burden  on 
working  memory. 

An  example  of  a  declarative  structure  in  the  domain  of  geometry  might  be  the  side-angle- 
side  theorem.  “If  two  sides  and  the  included  angles  of  two  triangles  are  congruent,  then  the 
triangles  are  congruent.”  Procedural  knowledge  might  involve  skills  of  placing  triangles 
into  correspondence,  determining  what  an  included  angle  is,  setting  subgoals,  and  making 
inferences.  It  might  also  include  some  frequently  encountered  uses  of  this  rule  such  as 
recognizing  triangles  as  congruent  which  meet  this  condition. 

Given  that  competence  depends  on  both  declarative  and  procedural  knowledge  why  have 
we  placed  the  emphasis  on  the  procedural?  This  is  because  our  view  is  that  the  acquisition 
of  the  declarative  knowledge  is  relatively  unproblematical.  However,  declarative 
knowledge  by  itself  is  inert  and  often  quite  useless.^  Declarative  knowledge  can  be 
acquired  by  simply  being  told  and  our  tutors  always  apply  in  a  context  where  students 
receive  such  declarative  instruction  external  to  the  tutors.  What  is  problematical  is 
acquiring  the  procedural  knowledge  that  enables  this  inert  knowledge  to  become  the  basis 
for  effective  action  in  the  context  of  use.  Production  rules  cannot  be  learned  by  simply 
being  told.  Rather  they  are  skills  that  are  only  acquired  by  doing.  Thus,  it  is  critical  to  set 
up  contexts  in  which  these  skills  can  be  displayed,  monitored,  and  appropriate  feedback 
given  to  shape  their  acquisition.  This  is  the  function  of  our  tutors. 

Initial  Work  on  Tutoring 


^While  our  theoretical  interpretation  of  the  phenomenon  is  different,  this  issue  of  inert  knowledge  is,  of 
course,  a  familiar  problem  in  education  (e.g..  Brown,  Collins,  &  Doguid,  1989;  Cognition  and  Technology 
Group  at  Vanderbilt,  1990;  and  Whitehead,  1929). 
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Our  initial  motivation  in  developing  intelligent  tutoring  systems  was  mainly  to  learn  more 
about  skill  acquisition  rather  than  to  produce  practical  classroom  results.  It  was  a 
significant  test  of  the  ACT  theory  to  see  whether  we  could  produce  successful  learning  by 
getting  students  to  act  like  the  underlying  production-rule  model.  It  was  by  no  means 
obvious  at  the  time  whether  or  not  there  were  going  to  be  major  gaps  in  ACT  production- 
rule  models  when  applied  to  such  instrucdonal  situations. 

In  1983  work  began  on  a  LISP  tutor  and  a  geometry  tutor  (Anderson  &  Reiser,  1985; 
Anderson,  Boyle  &  Yost,  1986;  Anderson,  Boyle,  Corbett  &  Lewis,  1990).  The  former 
helped  students  write  short  programs  in  LISP,  while  the  latter  helped  students  search  for 
geometry  proofs  and  represent  them  in  proof-graph  form.  The  screen  displays  for  the  two 
tutors  are  depicted  in  Figures  1  and  2.  These  two  tutors  embodied  a  number  of  key  ideas 
about  how  computer-based  instruction  should  be  realized.  These  ideas  have  been  part  of  all 
of  our  subsequent  tutors; 


Insert  Figures  1  and  2  About  Here 

(1)  Model:  There  should  be  a  production  rule  model  of  the  underlying  skill  incorporated 
into  the  tutor.  This  is  a  model  which  would  perform  the  task  the  student  was  expected  to 
perform.  At  each  point  in  the  problem  solving  the  model  is  capable  of  generating  a  set  of 
production  sequences  which  represent  correct  solutions  of  the  problem. 

(2)  On-Path  Actions:  Correct  actions  on  the  student’s  part  are  recognized  if  they  are 
along  one  of  the  correct  solution  paths  generated  by  the  model.  If  the  student  is  correct,  the 
tutor  does  not  comment  but  rather  allows  the  student  to  progress  with  the  solution. 
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(3)  Off-Path  Actions:  If  the  student  performs  an  off-path  action,  instruction  is  focused 
on  getting  the  student  back  on  path.  Our  earlier  tutors  required  students  to  always  stay  on 
path.  More  recent  tutors  allow  the  student  to  go  off  path  but  still  focus  instruction  on 
getting  the  student  back  on  path  when  they  are  off  path. 

(4)  Error  Feedback  and  Help:  The  tutors  possess  two  types  of  instruction.  If  the 
student  makes  a  recognizable  error  (a  bug)  a  message  can  be  given  explaining  why  it  is  an 
error.  This  is  generated  from  a  buggy  production  that  embodies  the  error.  If  the  student 
asks  for  help,  a  help  message  is  presented  to  guide  the  student  to  the  correct  solution.  This 
message  is  generated  from  the  information  along  a  correct  path.  Both  bug  messages  and 
help  messages  are  generated  to  be  specific  to  the  particular  context  in  which  they  occur  by 
using  the  particular  instantiations  of  the  general  production  rules. 

This  approach  to  tutoring  is  described  as  the  model-tracing  approach  because  it 
involves  trying  to  relate  the  behavioral  manifestations  of  the  student's  solution  on  the 
computer  to  some  sequence  of  production  firings  in  the  cognitive  model.  This  is  a  version 
of  the  plan-recognition  problem  which  is  recognized  as  being  computationally  very  difficult 
in  its  general  form  because  of  the  combinatorics  of  how  a  plan  can  fit  onto  external 
behavior.  We  originally  dealt  with  this  problem  by  insisting  that  each  action  of  the  student 
be  on  an  interpretable  path.  When  there  was  any  ambiguity  about  the  interpretation  of  the 
student’s  action  the  student  was  presented  with  a  disambiguation  menu  to  identify  the 
proper  interpretation  of  the  action.  If  the  student’s  action  was  in  error,  the  student  was  to 
correct  it  and  get  back  on  an  interpretable  path.  This  approach,  combined  with  an  interface 
that  yielded  a  relatively  rich  behavioral  trace  and  a  restriction  on  possible  interpretations  of 
the  behavior,  tamed  the  combinatorics  of  the  problem  so  that  we  were  able  to  follow  the 
solution  of  the  student  and  do  so  in  real  time.  As  we  will  discuss  later  in  this  manuscript. 
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we  have  subsequently  relaxed  the  requirement  that  the  student  stay  on  an  interpretable  path 
but  have  done  so  in  ways  that  avoided  the  potential  combinatorial  explosion. 

This  technical  accomplishment  was  no  mean  feat  in  itself.  It  was  and  still  is  the  only 
practical  automatic  approach  to  protocol  analysis.  This  model-tracing  approach  has  been 
adapted  to  doing  automatic  protocol  analyses  of  problem  solving  in  psychology 
experiments  where  there  is  no  tutorial  intervention  (Anderson,  1993).  However,  solving 
the  technical  problem  of  model  tracing  does  not  bring  with  it  any  automatic  guarantee  of 
instructional  effectiveness. 

Interacting  with  the  LISP  Tutor 

We  first  need  to  illustrate  what  it  is  like  to  interact  with  one  of  these  tutors.  Since  we  report 
more  empirical  research  from  the  LISP  tutor  than  any  from  the  others,  it  is  the  best  choice 
for  an  illustration.  This  section  will  describe  an  interaction  with  the  original  LISP  tutor  we 
created.^  Figure  1  depicts  the  terminal  screen  at  the  beginning  of  an  exercise.  The  original 
LISP  tutor  ran  on  Vaxes  and  communicated  with  students  via  a  regular  24  x  80  character 
terminal.  The  screen  is  divided  into  two  windows,  and  the  problem  description  appears  in 
the  "tutor  window"  at  the  top  of  the  screen.  As  the  student  types,  the  code  appears  in  the 
"code  window"  at  the  bottom  of  the  screen.  This  exercise  is  drawn  from  the  lesson  in 
which  iteration  is  being  introduced.  Students  are  familiar  with  the  structure  of  function 
definitions  by  this  point,  so  the  tutor  has  put  up  the  template  for  a  definition,  filling  in 
defun  and  the  function  name  for  the  student.  The  symbols  in  angle  brackets  reify 
remaining  goals — that  is,  they  represent  code  components  remaining  for  the  student  to 
supply.  The  tutor  places  the  cursor  over  the  first  symbol  the  student  needs  to  expand, 
<PARAMETERS>. 

‘*As  it  turns  out,  there  are  a  number  of  different  LISP  tutors  that  we  have  constructed.  The  original  tutor  is 
called  affectionately  "LISP  Tutor  classic"  in  our  laboratory. 
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As  the  student  works  on  an  exercise,  this  tutor  monitors  the  student's  input,  essentially  on 
a  symbol-by-symbol  basis.  As  long  as  the  student  is  on  some  reasonable  solution  path,  the 
tutor  remains  in  the  background  and  the  interface  behaves  like  a  structure  editor.  The  tutor 
expands  templates  for  function  calls,  provides  balancing  right-parentheses,  and  advances 
the  cursor  over  the  remaining  symbols  which  must  be  expanded.  If  the  student  makes  a 
mistake,  however,  the  tutor  immediately  provides  feedback  and  gives  the  student  another 
opportunity  to  type  a  correct  symbol.  If  the  student  requests  an  explanation  or  if  the 
student  appears  to  be  floundering^,  the  tutor  will  also  provide  a  correct  next  step  in  a 
solution,  along  with  an  explanation. 

Table  1  contains  a  record  of  a  hypothetical  student  completing  the  code  for  the  exercise.^ 
This  table  does  not  attempt  to  show  the  terminal  screen  as  it  actually  appears  at  each  step  in 
the  exercise.  Instead,  it  shows  an  abbreviated  "teletype"  version  of  the  interaction.  As 
described  above,  while  the  student  is  working,  the  problem  description  generally  remains 
in  the  tutor  window  (except  when  a  message  to  the  student  is  being  presented),  while  the 
code  window  is  being  updated  on  a  symbol-by-symbol  basis.  Instead  of  portraying  each 
update  to  the  code  window  in  the  interaction,  the  table  portrays  nine  key  "cycles"  in  which 
the  tutor  interrupts  to  communicate  with  the  student.  At  each  of  these  enumerated  cycles 
the  complete  contents  of  the  code  window  are  shown,  along  with  the  tutor's  response.  The 
tutor's  response  is  shown  below  the  code  to  capture  the  temporal  sequence  of  events;  on 
the  terminal  screen,  the  tutor's  communications  would  appear  in  the  tutor  window  above 
the  code.  In  each  cycle  all  the  code  which  the  student  has  typed  since  the  preceding  cycle  is 


^Students  are  judged  lo  be  floundering  at  a  step  in  the  solution  when  they  repeat  the  same  type  of  eiror  three 
times  or  make  two  mistakes  that  the  tutor  does  not  recognize. 

^Undoubtedly,  the  use  of  LISP  creates  a  barrier  to  communication  with  that  fraction  of  the  readership  that  is 
not  familiar  with  LISP.  However,  the  semantics  of  LISP  ate  not  really  necessary  to  understanding  how  the 
tutor  interacts  ot  how  these  interactions  depend  on  the  underiying  production-rule  models. 
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shown  in  boldface  in  Table  1.  However,  in  each  case,  the  tutor  is  responding  specifically 
to  the  last  symbol  the  student  typed. 


Insert  Table  1  About  Here 


In  the  first  of  the  cycles  displayed,  the  student  has  typed  in  the  parameter  list  and  has  called 
loop  in  order  to  iterate.  The  tutor  reminds  the  stijdent  that  it  is  necessary  to  create  some 
local  variables  before  entering  the  loop.  In  the  second  cycle,  the  student  has  called  let  and 
is  about  to  create  a  local  variable.  The  template  for  numeric  iteration  calls  for  two  local 
variables  in  this  function,  so  the  tutor  puts  up  a  menu  to  clarify  which  variable  the  student 
is  going  to  declare  first.  This  illustrates  the  tutor’s  need  to  know  at  all  times  what  the 
student’s  intentions  are  so  that  it  can  follow  the  student.  If  there  is  an  ambiguity,  it  will 
query  the  student  by  the  means  of  such  menus.  In  the  third  cycle,  the  student  has  coded  an 
initial  value  which  would  be  correct  if  the  function  were  going  to  count  up.  However,  this 
exercise  is  intended  to  give  the  student  practice  in  counting  down,  so  the  tutor  redirects  the 
student.  In  the  fourth  cycle,  the  student  has  made  a  typing  error  which  the  tutor 
recognizes,  and  in  the  fifth  cycle  the  student  is  attempting  to  return  the  correct  value  from 
the  loop,  but  has  forgotten  to  call  return.  The  tutor  reminds  the  student  that  a  special 
function  call  is  required  to  exit  a  loop.  The  interactions  between  the  tutor  and  student 
continue  in  this  manner.  Note  that,  for  illustration’s  sake,  this  interaction  shows  students 
making  rather  more  errors  than  they  usually  do.  Typically,  the  error  rate  is  about  15 
percent  while  it  is  approximately  30  percent  in  this  dialogue.  After  each  exercise,  the 
student  enters  a  standard  LISP  environment  called  the  LISP  window.  Students  can 
experiment  in  the  LISP  window  as  they  choose;  the  only  constraint  is  that  they  successfully 
call  the  funrtion  they  have  just  defined  (which  the  tutor  has  automatically  loaded). 
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The  Initial  Incursion  into  the  Classroom 

In  1984  we  ran  a  few  high-school  students  through  the  geometry  tutor  and  taught  a  mini- 
course  in  Computer  Science  at  CMU  with  the  LISP  tutor.  The  results  exceeded  our  initial 
modest  expectations.  Students  seemed  to  learn  fairly  well  with  the  geometry  tutor.  The 
LISP  mini  course  was  broken  up  into  two  groups  to  allow  for  an  evaluation.  One  group 
worked  exercises  with  the  LISP  tutor  and  one  worked  the  same  exercises  in  a  standard 
LISP  environment.  Students  with  the  LISP  tutor  took  30%  less  time  and  scored  one 
standard  deviation  higher  on  a  final  test  than  the  students  in  the  control  condition. 

In  response  to  these  results  we  created  a  full  semester  course  in  LISP,  taught  to  Humanities 
and  Social  Sciences  students,  which  is  still  a  successful  course  today  but  with  a  number  of 
revisions  over  the  period.^  We  decided  to  try  to  use  the  geometry  tutor  in  a  real  classroom 
and  were  able  to  create  a  classroom  of  10  Xerox  D-machines  which  were  in  Peabody  High 
School  in  Pittsburgh  from  1985  to  1988. 

Emboldened  by  the  prospects  of  practical  tutors,  we  set  out  to  create  a  third  tutor  for 
Algebra  I.  That  tutor  was  also  implemented  on  the  D-machine.  It  tutored  algebra  symbol- 
manipulation  skills  (such  as  solving  linear  equations)  which  had  been  the  focus  of  some 
discussion  in  cognitive  science  (e.g.,  Matz,  1982). 

Figure  3  illustrates  the  appearance  of  the  algebra  tutor  (for  more  details  read  Milson,  Lewis, 
&  Anderson,  1990).  Our  analysis  of  the  algebra  symbol  manipulation  skills  revealed  that 
there  were  clear  algorithms  for  accomplishing  all  tasks  and  successful  students  had  learned 
these  algorithms.  The  algorithms  were  described  in  more  or  less  detail  in  various 
textbooks.  A  key  feature  of  these  algorithms  was  that  they  were  quite  hierarchical  such  that 

^Currently,  the  course  is  now  a  combined  LISP  and  Prolog  course  in  which  the  students  learn  both 
languages.  The  tutor  is  now  delivered  on  a  Macintosh  system. 
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to  solve  a  linear  equation  one  might  have  to  get  rid  of  embedded  expressions  which 
required  distribution  which  might  require  multiplication  of  fractions  which  required  integer 
multiplication  and  so  on.  We  observed  that  weaker  students  were  frequently  confused 
about  a  lower-level  operation  and  it  was  difficult  to  get  them  to  identify  the  level  at  which 
their  problem  occurred.  To  facilitate  this  remediation  we  attempted  to  illustrate  this 
hierarchical  structure  by  a  representation  like  Figure  3  which  placed  boxes  representing 
suboperations  within  boxes  representing  larger  operations.  This  would  allow  us  to  identify 
and  focus  instruction  on  the  level  at  which  the  student  was  having  difficulties. 

Insert  Figure  3  About  Here 


The  Eight  Principles 

About  that  time  it  seemed  that  we  should  try  to  draw  a  tighter  connection  between  the  ACT* 
theory  and  our  tutoring  practice.  Anderson,  Boyle,  Farrell,  and  Reiser  (1987)  examined 
the  ACT*  theory  and  extracted  what  we  felt  were  eight  principles  for  design  of  tutors  which 
followed  from  that  theory  and  which  are  reviewed  below: 

Principle  1:  Represent  student  competence  as  a  production  set.  The 
fundamental  insight  is  that  the  tutoring  enterprise  should  be  informed  by  an  accurate  model 
of  the  target  skill.  The  cognitive  model  allows  us  to  set  appropriate  curriculum  objectives 
and  to  properly  interpret  the  actions  of  the  student.  This  has  been  the  essential  difference 
between  our  approach  and  the  more  behaviorist  approaches  to  computer-based  instruction. 
The  production  rules  define  a  more  abstract  and,  we  believe,  more  accurate  representation 
of  the  target  skill  than  did  the  behavioral  objectives  of  a  typical  behaviorist  analysis  (e.g. 
Bunderson  &  Faust,  1976;  Gagn6  &  Briggs,  1974).  However,  our  approach  shares  with 
the  behaviorist  approach  the  idea  of  decomposing  a  skill  into  components  and  organizing 
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instructions  according  to  the  componential  analysis.  The  difference  is  in  what  the 
components  are. 

This  principle  does  not  specify  how  to  define  a  computer  interface,  how  to  interact  with  a 
student,  or  when  to  promote  the  student  through  the  curriculum.  This  all  depends  on  a 
theory  of  how  such  production  rules  are  acquired.  The  other  principles  of  tutoring  were  all 
concerned  with  how  to  take  this  first  insight  and  convert  it  into  pedagogical  policy.  These 
other  principles  were  derived  with  varying  degrees  of  rigor  from  the  ACT  theory  of  skill 
acquisition. 

Principle  2:  Communicate  the  goal  structure  underlying  the  problem 
solving.  One  of  the  enduring  assumptions  of  the  ACT  theory  has  been  that  solving  a 
problem  involves  decomposing  that  problem  into  a  set  of  goals  and  subgoals.  Another 
observation  was  that  in  many  domains  which  students  had  difficulty  mastering  (e.g.,  proof 
skills  in  geometry  or  writing  recursive  programs)  the  goal  structure  governing  the  problem 
solving  was  not  adequately  communicated  to  the  student.  So  the  reasonable  assumption 
was  that  exposing  and  communicating  such  goals  should  be  an  instructional  objective.  We 
adapted  an  approach  that  has  been  called  reification  (Brown,  1985;  Collins  &  Brown, 
1987).  We  attempted  to  develop  interfaces  that  made  explicit  the  goal  structures  which 
were  only  implicit  in  the  instruction.  We  had  at  least  two  notable  successes.  This  was  the 
use  of  a  proof  graph  in  geometry  to  illustrate  the  subgoaling  relationship  between  certain 
conclusions  and  the  ultimate  conclusion  of  the  proof.®  The  other  was  Singley’s  use  of  a 
subgoal  tree  to  illustrate  use  of  the  chain  rule  in  related-rates  calculus  problems  (Singley, 
1986). 


®For  an  evaluation  of  the  contribution  of  the  proof  graph  over  and  above  any  tutoring  see  Schemes  &  Sieg 
(in  press). 
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Principle  3:  Provide  instruction  in  the  problem  solving  context.  This 
principle  was  based  on  the  research  showing  the  context-specificity  of  learning  (e.g., 
Anderson,  1990;  Ch.  7).  The  current  situated  learning  movement  (e.g.,  Collins,  Brown, 
&  Newman,  1989;  Lave  &  Wenger,  1990)  presumably  gives  a  new  currency  to  this 
principle.  The  difficulty  with  this  principle  is  that  there  is  not  a  detailed  theoretical 
interpretation  of  why  it  is  true  and  so  it  is  a  little  hard  to  know  how  to  apply  it  in  detail. 
Does  this  mean  provide  instruction  in  the  same  class  session  as  the  tutor  is  used?  before 
each  problem?  in  the  midst  of  each  problem?  As  it  has  evolved  in  our  applications  this  has 
come  to  mean  providing  instruction  between  each  new  section  in  the  tutor  (a  section  is 
where  new  production  rules  are  introduced)  allowing  the  student  to  refer  back  to  this 
instruction  in  the  course  of  problem  solving.  We  have  experimented  with  placing 
instruction  at  the  precise  point  where  it  is  needed  in  a  problem  but  students  find  this 
interferes  with  their  problem  solving. 

Principle  4:  Promote  an  abstract  understanding  of  the  problem-solving 
knowledge.  This  principle  was  motivated  by  the  observation  that  students  will  often 
develop  overly  specific  knowledge  from  particular  problem-solving  examples.  In  terms  of 
production  rules  this  has  meant  that  the  conditions  on  the  rules  were  not  sufficiently 
general.  While  the  problem  is  undoubtedly  real  this  principle  provides  no  guidance  for 
how  it  is  to  be  achieved.  In  practice  we  tried  to  reinforce  the  correct  abstractions  in  the 
language  of  our  help  and  error  messages. 

Principle  5:  Minimize  Working  Memory  Load.  This  principle  was  motivated  by 
the  fact  that  learning  a  new  production  rule  in  ACT  requires  that  all  the  relevant  information 
(relevant  to  the  condition  and  action  of  the  to-be-leamed  production)  be  simultaneously 
active  in  memory.  Keeping  other  information  active  could  potentially  interfere  with 
learning  the  target  information.  Sweller  (1988)  has  shown  that  a  high  working-memory 
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load  interferes  with  learning.  This  principle  means  minimizing  presentation  and  processing 
of  information  not  relevant  to  the  target  pnxiuctions.  This  includes  minimizing  presentation 
of  instruction  while  problem  solving  since  processing  this  instruction  poses  another 
working-memory  load. 

This  also  implies  that  one  should  try  to  provide  instruction  on  speciHc  components  only 
when  other  components  of  the  skill  have  already  been  relatively  well  mastered.  This  leads 
to  a  curriculum  design  in  which  only  a  few  new  things  are  taught  at  a  time.  This  could  be 
viewed  as  being  at  odds  with  the  current  approaches  such  as  cognitive  apprenticeship  or 
anchored  instruction,  which  advocate  teaching  component  skills  in  the  context  of  complex, 
real-world  problems.  However,  this  approach  does  not  deny  the  value  of  learning  in  such 
context  but  rather  argues  that  students  should  gradually  acquire  the  skills  required  to  deal 
with  this  complexity  rather  than  having  to  acquire  them  all  at  once. 

Principle  6:  Provide  immediate  feedback  on  errors.  This  clearly  has  been  the 
most  controversial  of  our  tutoring  principles.  The  ACT*  theory  claimed  that  new 
productions  were  created  from  records  of  problem-solving  traces.  Therefore,  the  longer 
one  waited  until  an  error  was  corrected  the  longer  the  span  of  problem  solving  over  which 
the  student  would  have  to  integrate  to  create  a  production.  The  current  ACT-R  theory 
claims  that  one  learns  from  problem-solving  products.  Thus,  the  learner  examines  the 
resulting  solution  (code,  proof,  algebraic  derivation)  and  builds  productions  from  that. 
Thus,  it  does  not  matter  whether  all  the  critical  steps  occur  together  in  time  or  not — only 
that  they  be  represented  in  the  final  solution.  Thus,  the  principal  theoretical  justification  for 
immediate  feedback  no  longer  exists  in  ACT-R.  We  will  later  discuss  evidence  about 
immediacy  of  feedback  from  our  tutors  which  is  consistent  with  the  current  ACT-R 
conception.  Still,  we  will  see  that  immediate  feedback  can  be  beneficial  in  cutting  down  on 
time  spent  in  error  states  and  making  it  easier  to  interpret  the  student's  problem  solving. 
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Principle  7;  Adjust  the  grain  size  of  instruction  with  learning.  This  principle 
was  motivated  by  the  composition  learning  operator  in  ACT*  which  claimed  that  single 
productions  would  be  composed  into  larger  productions  which  did  in  one  cognitive  step 
what  had  been  done  in  many  steps.  While  ACT-R  does  not  have  such  a  composition 
learning  operator  it  still  predicts  this  change  in  the  grain  size  of  problem  solution  but  from 
other  mechanisms  (Anderson,  1993,  Ch.  4).  Thus,  it  seemed  reasonable  to  design  the 
interface  so  that  one  could  process  the  student’s  t)roblem  solving  in  ever  larger  units  of 
analysis.  There  has  only  been  one  early  attempt  to  do  this,  however,  and  this  was  with  the 
algebra  tutor  (Anderson,  Boyle,  Corbett,  &  Lewis,  1990).  That  attempt  was  not  notably 
successful.  In  retrospect,  our  problems  here  reflected  some  fundamental  misconceptions 
about  the  role  of  the  interface  in  problem  solving.  This  is  a  topic  which  will  be  discussed  at 
length  later  in  the  paper. 

Principle  8.  Facilitate  Successive  Approximations  to  the  Target  Skill: 
Frequently,  when  students  are  initially  trying  to  perform  a  skill,  they  cannot  perform  all  the 
steps.  We  had  the  tutor  fill  in  the  missing  steps.  The  expectation  was  that  with  repeated 
practice  this  division  of  labor  between  student  and  tutor  would  change  with  the  student 
providing  more  and  more  of  the  work  until  the  tutor  was  completely  in  the  background.  In 
practice  this  successive  approximation  has  frequently  worked  quite  well.  This  principle 
seems  quite  analogous  to  “fading”  in  the  cognitive  apprenticeship  terminology  (Collins, 
Brown,  &  Newman,  1990). 

Some  of  these  principles  are  similar  to  ideas  that  accompanied  more  behaviorist  attempts  at 
instructional  design  (Bunderson  &  Faust,  1976;  Gagn6  &  Briggs,  1974).  This  is 
particularly  true  for  principles  3,  6, 7,  and  8  above.  The  difference  is  that  these  principles 
were  being  used  in  service  of  a  different  representation  of  the  underlying  skill.  The  places 
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where  these  principles  add  something  to  the  standard  behaviorist  approach  (principles  1, 2, 
4,  and  5  above)  reflect  the  different  representational  assumptions.  This  is  a  case  where 
assumptions  about  knowledge  representation  matter. 

The  fact  that  our  tutors  embody  cognitive  models  of  the  target  competence  does  not  need 
imply  that  they  would  always  behave  differently  than  instructional  systems  based  on 
behaviorist  principles.  It  depends  on  the  domain.  If  we  were  building  a  spelling  tutor  with 
the  goal  of  memorization,  we  suspect  it  would  be  much  like  behaviorist  applications  (e.g.. 
Porter,  1961)  which  produce  similar  achievement  gains  as  do  our  systems.  However,  we 
have  chosen  to  focus  our  applications  on  much  more  complex  skills  where  our  cognitive 
models  do  lead  to  different  instructional  strategies.  It  is  our  impression  that  the  behaviorist 
programs  have  not  had  much  success  in  extending  to  such  complex  domains. 

Stage  2:  The  Evaluations  and  Empirical  Studies 


Anderson,  Boyle,  Corbett,  and  Lewis  (1990)  report  the  state  of  the  tutoring  work  in  1987, 
including  the  results  of  the  first  phase  of  research  activity.  This  section  reviews  those 
results  and  brings  the  research  record  up  to  date.  The  first  three  sections  describe 
summative  evaluations  of  the  geometry  tutor,  algebra  tutor,  and  LISP  tutor.  Succeeding 
sections  discuss  evidence  on  the  componential  nature  of  skill  acquisition,  student  modeling, 
and  feedback  control  and  content. 

The  Geometry  Tutor 

The  geometry  tutor  was  used  in  a  pilot  study  in  the  1985-1986  school  year.  A  number  of 
classes  were  exposed  to  it  and  all  showed  large  achievement  gains.  The  1986-1987  school 
year  was  the  major  test  where  we  compared  classes  with  the  tutor  with  classes  without  the 
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tutor  but  with  the  same  teacher.  We  performed  a  number  of  regression  analyses  trying  to 
predict  student  performance  in  a  final  paper  and  pencil  test  of  proof  skills.  The  following 
equation  predicted  student  performance  on  a  scale  from  0-80: 

35  +  7.5*  (letter  grade  in  algebra) 

+  14  if  access  to  tutor  one-on-one 
4  if  access  to  tutor  two-on-one 

The  student’s  letter  grade  in  the  prior  year’s  algebra  class  (1  =  D,...,4  =  A)  was  the  best 
measure  of  prior  individual  student  differences  in  predicting  geometry  test  performance 
(better  than  IQ,  for  instance).  The  14  points  for  the  tutor  reflected  more  than  one  standard 
deviation  in  the  population  or  more  than  one  letter  grade  on  the  test.  Because  we  did  not 
have  enough  machines,  sometimes  pairs  of  students  worked  on  the  machines.  In  this  case, 
most  of  the  tutor  benefit  was  eliminated  and  the  remaining  4  point  advantage  of  these 
students  was  not  statistically  different  from  the  control  group. 

In  addition  to  our  own  assessment  of  the  tutor  there  have  appeared  reports  from  third  party 
observers  (Schoefield  &  Evan-Rhodes,  1989)  and  the  teacher  (Wertheimer,  1990) 
confirming  the  large  positive  impact  of  the  tutor  on  the  classroom.  Schoefield  and  Evan- 
Rhodes  reported  there  were  large  improvements  in  the  motivation  of  students,  with 
students  spending  more  time  on  task.^  Wertheimer  (1990)  reported  that  he  found  the 
experience  satisfying  as  a  teacher  because  it  allowed  him  to  focus  on  the  specific  difficulties 
of  specific  students. 

A  more  recent  geometry  tutor  has  been  completed  based  on  the  cognitive  model  of 
geometry  proof  of  Koedinger  and  Anderson  (1990).  It  has  been  subject  to  a  preliminary 

^It  got  to  the  point  were  fights  were  occurring  among  students  for  access  to  the  tutor. 
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evaluation  (Koedinger  &  Anderson,  1993)  in  which  we  also  found  a  large  positive  result 
but  only  for  the  teacher  carefully  integrated  into  the  project.  Students  with  the  tutor  and  the 
project  teacher  averaged  just  over  5  proofs  correct  out  of  8  while  students  in  each  of  the 
other  three  conditions  (project  teacher  without  the  tutor,  tutor  without  non-project  teacher, 
or  non-project  teacher  without  tutor)  averaged  just  over  3  proofs  correct.  The  fact  that 
the  tutor  had  its  benefit  only  for  the  project  teacher  highlights  the  issue  of  integrating  the 
tutor  into  the  classroom. 

The  Algebra  Tutor 

An  evaluation  was  performed  of  the  algebra  tutor  in  the  1987-1988  school  year.  There 
were  no  differences  between  experimental  classes  which  had  access  to  the  tutors  and 
control  classes  which  did  not.  We  think  the  major  reason  for  the  lack  of  effect  was  that 
there  was  large  difference  between  the  tutor  interface  and  the  interface  used  in  class  (i.e., 
paper  and  pencil).  It  was  just  not  obvious  how  to  map  the  boxed  representation  of 
algorithmic  decompositions  (see  Figure  3)  to  the  linear  line-by-line  transformations  that 
were  used  to  assess  performance  in  the  paper-and-pencil  post  test.  A  less  important  reason 
relates  to  a  ninth  grade  algebra  class  in  an  urban  school.  Symbol  manipulation  is 
sufficiently  easy  that  some  students  were  mastering  the  skill  quite  well  without  tutorial 
intervention.  Other  students  were  just  not  involved  in  the  class  at  all  (often  not  attending) 
and  the  algebra  tutor  was  too  peripheral  a  part  of  their  experience  to  help  change  their 
general  pattern  of  behavior  towards  school. 

We  followed  up  the  algebra  tutor  with  a  word  algebra  tutor  which  had  some  large  positive 
results  in  the  laboratory  (Singley,  Anderson,  Gevins,  &  Hoffman,  1989)  although  it  was 

*®This  is  a  one  standard  deviation  difference. 

Subsequent  research  has  suggested  that  teachers  take  about  a  year  to  become  comfortable  with  the  tutor 
and  second  year  teachers  have  students  who  show  achievement  gains  with  the  tutor. 
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never  tested  in  the  classroom.  We  think  there  are  two  basic  reasons  for  the  success  of  the 
word  problem  tutor  in  the  laboratory.  First,  the  mapping  from  word  problems  in  the  tutor 
to  the  paper  and  pencil  post  test  was  obvious.  Second,  word  problems  are  in  fact  a  difficult 
topic  for  even  motivated  students  and  we  were  able  to  illuminate  their  instruction  with  our 
cognitive  model  for  their  solution.  We  are  currently  working  with  yet  a  newer  algebra 
word  tutor  which  is  being  used  in  the  Pittsburgh  Public  Schools.  Preliminary  evaluations 
again  suggest  significant  achievement  gains. 


The  LISP  Tutor 

The  LISP  tutor  was  evaluated  in  a  classroom  setting  early  in  its  development.  In  the  fall  of 
1984  we  taught  a  mini-course  on  LISP  in  which  students  attended  lectures  and  completed  a 
fixed  set  of  programming  exercises  either  with  the  tutor  or  in  a  standard  LISP  environment. 
Students  using  the  tutor  completed  the  exercises  30%  faster  and  performed  43%  better  on  a 
posttest.  We  performed  a  second,  laboratory  evaluation  which  more  closely  approximates 
the  current  self-paced  course  structure  (Corbett  &  Anderson,  1991).  Students  worked 
through  the  lessons  reading  the  same  text  and  trying  to  solve  the  same  set  of  exercises  with 
and  without  the  tutor.  In  this  evaluation  students  using  the  tutor  completed  the  exercises 
64%  faster  and  scored  30%  higher  on  posttests  than  students  using  a  standard  LISP 
environment.  As  will  be  elaborated  later,  we  think  that  the  only  reason  for  the  posttest 
difference  is  that  students  working  in  the  standard  non-tutor  environment  were  unable  to 
generate  working  solutions  to  all  the  exercises.  Conceivably,  if  students  in  the  control 
condition  had  put  in  sufficient  time  they  could  have  eventually  found  working  solutions  and 
scored  as  well  on  the  posttest.  Nonetheless,  this  study  supports  the  claim  that  a  well- 
designed  tutor  can  bring  students  to  as  high  or  higher  achievement  levels  in  no  more  than 
one-third  the  time  required  by  traditional  learning  environments. 
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A  more  typical  practice  in  education  evaluation  is  to  hold  learning  time  constant  and 
examine  differences  in  achievement  scores.  This  is  what  we  were  forced  to  do  in  our 
algebra  and  geometry  evaluations  because  we  could  not  manipulate  the  time  students  spent 
on  these  problems.  The  achievement  differences  are  always  a  little  hard  to  judge — what 
does  14  points  on  an  80  point  test  really  mean?  One  solution  is  to  report  differences  in 
achievement  level  measured  in  standard  deviations  (e.g.,  Bloom,  1984).  However, 
standard  deviation  differences  say  as  much  about  the  variance  in  test  scores  as  they  do 
about  the  impact  of  an  instructional  manipulation.  Such  variances  are  substantially  affected 
by  test  construction  and  inherent  variability  in  the  population.  Thus,  the  numbers  are 
virtually  meaningless  except  to  establish  the  direction  of  the  difference.  It  is  more 
meaningful  to  hold  constant  the  level  of  mastery  required  and  look  at  differences  in  time  to 
achieve  that  level.  This  reflects  the  true  gain  of  an  educational  technique. 

The  LISP  tutor  has  been  followed  up  with  a  general  programming  environment  in  which 
LISP,  Prolog,  and  Pascal  tutors  have  been  built  (Anderson,  Conrad,  Corbett,  Fincham, 
Hoffman,  &  Wu,  1993).  The  Pascal  tutor  has  just  started  to  be  used  in  the  public  schools. 
This  will  serve  as  a  basis  for  evaluation  outside  of  the  rather  specialized  CMU  population. 

Componential  Analysis  of  Learning 

One  of  the  current  controversies  in  cognitive  science  and  education  is  whether  it  is  possible 
to  take  a  complex  competence,  break  it  down  into  its  components,  and  understand  the 
learning  and  performance  of  that  competence  in  terms  of  the  learning  and  performance  of 
the  components  (e.g.,  Shepard,  1992).  When  we  have  addressed  this  question  in  the 
context  of  our  tutors  the  answer  is  a  resounding  yes. 
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Consider  the  question  of  predicting  how  quickly  and  accurately  a  specific  student  will 
generate  a  particular  piece  of  code  in  a  particular  LISP  program.  In  a  rather  exhaustive 
analysis  of  data  from  the  LISP  tutor,  Anderson,  Conrad,  and  Corbett  (1989)  concluded 
that  there  were  essentially  four  critical  factors: 

(1)  Production  practice.  The  first  factor  was  how  often  the  student  had  applied  the 
relevant  production  rule  earlier.  As  students  have  more  opportunities  to  use  a  production 
rule  across  exercises,  their  performance  on  the  rule^  improves.  Because  there  is  a  many-to- 
one  mapping  between  production  rules  and  surface  code  symbols  (e.g.,  car,  +,  write,  for, 
etc.)  in  different  contexts,  we  are  able  to  show  it  is  the  rule  and  not  the  surface  construct 
which  is  the  critical  unit  of  practice.  Similar  production-specific  learning  has  been  shown 
in  the  case  of  geometry  where  there  is  a  many-to-one  mapping  between  production  rules 
and  surface  rules  like  "side-angle-side"  (Anderson,  Bellezza,  &  Boyle,  1993).  In  the  ACT 
theory  this  is  attributed  to  strengthening  the  production  rule.  Figure  4  shows  learning 
curves  for  ‘new’  productions  being  introduced  in  a  LISP  lesson  and  ‘old’  productions 
introduced  in  previous  lessons.  As  can  be  seen  both  show  improvement  as  a  function  of 
amount  of  practice  within  the  lesson.  Old  productions  are  better  off  because  of  practice 
from  previous  lessons.  Anderson,  Conrad,  and  Corbett  also  showed  that  student 
performance  on  old  productions  in  a  new  lesson  starts  off  close  to  where  it  left  off  on  the 
previous  lesson  with  only  a  little  forgetting. 

(2)  Within-problem  practice  efi'ects.  In  both  LISP  and  geometry  we  were  able  to 
show  that  time  and  accuracy  for  rule  application  improves  as  the  student  progresses  further 
into  a  specific  problem,  partialing  out  any  effect  of  rule-specific  practice.  In  the  ACT 
theory  this  is  attributed  to  strengthening  of  the  declarative  representation  of  the  problem 
through  repeated  access. 
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(3)  Acquisition  factor.  In  a  factor  analysis  of  student  performance  we  found  that 
students  varied  in  how  well  they  performed  on  new  rules  that  were  introduced  in  a  lesson. 
The  real  significance  of  this  factor  is  unclear.  It  may  reflect  some  profound  individual 
differences  or  just  the  care  with  which  students  reviewed  the  material. 

(4)  Retention  factor.  The  same  factor  analysis  identified  students  who  did  well  in 
retaining  productions  from  earlier  lessons.  This  factor  was  largely  orthogonal  to  the 
acquisition  factor.  Again  the  real  significance  of  this  factor  is  uncertain.  It  could  again 
reflect  some  profound  individual  differences  or  just  how  much  students  reviewed  material 
between  lessons. 

The  upshot  of  this  analysis  is  the  following  scheme  for  predicting  how  well  a  student  will 
do  on  a  fragment  of  code:  First  determine  if  an  old  or  new  production  is  generating  that 
code.  If  it  is  a  new  production  one  needs  to  use  the  learning  curve  for  new  productions  to 
figure  out  the  within-lesson  practice  effects,  add  in  a  factor  to  represent  how  much  the 
student  has  worked  on  that  problem,  and  add  in  an  individual  difference  effect  to  reflect 
where  that  student  stands  on  the  acquisition  factor.  To  predict  performance  on  an  old 
production  one  adds  in  the  within-lesson  practice  effect  for  old  productions,  the  problem 
practice  effect,  and  an  effect  to  reflect  where  that  student  is  on  the  retention  factor.  As  far 
as  we  could  determine,  these  considerations  captured  the  predictable  variance. 

Knowledge  Tracing 

The  LISP  Tutor  had  a  student  modeling  facility  called  knowledge  tracing  shared  by 
neither  the  geometry  or  algebra  tutors.^2  a  student  worked  through  the  exercises  the 

J^Public  school  teachers  have  been  unwilling  to  allow  students  to  progress  at  their  own  rate  as  enabled  by 
the  knowledge-tracing  facility.  Recently  we  have  gotten  Pittsburgh  Public  School  teachers  to  accept  such 
individual  progress. 
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tutor  used  a  Bayesian  procedure  to  estimate  the  probability  that  the  student  had  learned  each 
of  the  rules  in  the  cognitive  model.  Knowledge  tracing  was  used  to  implement  a  form  of 
mastery  learning.  Students  were  given  sufficient  practice  in  each  section  of  the  curriculum 
to  bring  them  to  a  specified  degree  of  mastery  of  the  individual  cognitive  rules  introduced  in 
the  section  before  proceeding  to  the  next  section.  This  feature  has  substantial  impact  on 
student  achievement  level  (Anderson,  Conrad  and  Corbett,  1989).  Knowledge  tracing  is  a 
regular  feature  of  all  the  tutors  we  are  currently  developing  (see  the  section  on  practical 
deployment  in  this  paper). 

A  more  detailed  examination  of  the  knowledge  tracing  model  provided  further  confirmation 
of  the  componential  analysis  in  the  cognitive  model:  The  learning  and  performance  models 
that  underlie  knowledge  tracing  in  the  tutor  can  be  used  to  predict  posttest  performance. 
The  probability  that  a  student  will  solve  each  posttest  exercise  correctly  can  be  accurately 
derived  from  the  probability  that  the  student  has  learned  each  of  the  necessary  rules 
(Corbett  &  Anderson,  1992;  Corbett,  Anderson,  &  O’Brien,  in  press). 

These  results  have  strong  implications  for  instruction.  They  imply  that  we  should  be  able 
to  get  students  to  master  the  overall  skill  by  getting  them  to  master  the  individual 
components.  Numerous  analyses  have  reported  positive  results  for  mastery-based 
curricula  (e.g.,  Guskey  &  Gates,  1986;  Kulik,  Kulik,  &  Bangert-Downs,  1986)  although 
the  interpretation  of  these  results  is  not  without  controversy  (e.g.,  Anderson  &  Bums, 
1987;  Guskey,  1987;  Slavin,  1987).  Our  application  of  mastery  principles  is  different  than 
most  other  efforts  in  that  it  is  done  on  an  individual  student  basis  and  in  that  it  applies  to  the 
detailed  components  of  the  target  skill. 

Locus  of  Feedback  Control 
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As  prescribed  by  one  of  the  original  tutoring  principles,  our  tutors  conventionally  employ 
immediate  feedback  and  require  immediate  error  correction.  The  LISP  tutor  has  served  as 
the  vehicle  for  some  studies  that  differentially  distribute  the  control  of  feedback  and  the 
timing  of  error  correction  between  the  tutor  and  student.  We  developed  three  new  versions 
that  vary  widely  on  the  dimension  of  student/tutor  control.  At  the  far  extreme  from 
immediate  feedback  and  control,  we  created  a  version  which  provides  no  advice  on  how  to 
achieve  programming  goals.  Students  enter  their  code  with  a  structure  editor  and  have 
access  to  a  LISP  interpreter,  but  are  largely  on  their  own.  This  tutor  provides  just  one  bit 
of  information:  at  any  time  in  the  course  of  problem  solving  students  can  ask  whether  their 
solution  is  correct  (similar  to  checking  an  answer  at  the  back  of  the  book).  This  provides 
the  best  control  against  which  to  measure  the  effectiveness  of  our  tutors,  since  it  holds 
constant  type  of  problem-solving  interface,  the  non-tutor  instruction,  and  the  exercises 
attempted. 

The  remaining  two  versions  are  capable  of  providing  the  same  advice  as  the  standard 
immediate  feedback  version,  but  do  so  under  different  circumstances.  One  version,  which 
we  call  the  error-flagging  tutor,  falls  closer  to  standard  immediate  feedback.  This  tutor 
identifies  an  error  as  soon  as  it  occurs,  by  flagging  it  in  bold  font  on  the  screen,  but 
provides  no  feedback  message  and  does  not  require  immediate  correction.  The  student  can 
ask  for  a  feedback  message  (the  same  one  that  would  be  presented  automatically  in 
immediate  feedback),  can  try  to  fix  the  error  without  feedback,  or  can  continue  generating 
new  code  and  come  back  to  the  error  later.  We  call  the  other  version  the  demand  tutor 
because  it  provides  no  assistance  until  asked  by  the  student.  This  tutor  appears  like  the  no¬ 
feedback  control  version  as  the  student  works,  unless  the  student  asks  for  error  feedback. 
At  that  point  the  tutor  will  identify  the  first  error  in  the  code  (if  any)  and  provide  the  same 
message  as  appears  automatically  in  the  immediate  feedback  version.  In  both  of  these 
versions  the  student  can  ask  for  advice  on  how  to  accomplish  a  programming  goal,  in 
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addition  to  error  feedback.  The  same  advice  is  available  at  each  goal  in  these  tutors  as  in 
the  standard  tutor. 

In  the  no-feedback,  demand-feedback,  and  eiror-flagging  tutors  the  student  might  generate 
a  complete  working  solution  that  the  tutor  does  not  recognize  (cannot  generate).  As  a 
result,  the  tutor  will  try  any  code  that  it  does  not  recognize  on  a  set  of  test  cases  and  will 
accept  the  solution  if  it  works.  In  practice,  this  happens  rarely.  Only  about  5%  of  the 
students’  unrecognized  solutions  worked. 

We  compared  the  four  versions  of  the  tutor  across  a  five  lesson  sequence  that  took  students 
from  the  easiest  (introductory)  lessons  to  the  most  challenging  (recursion)  lessons  (Corbett 
&  Anderson,  1991).  Each  student  attempted  a  fixed  set  of  exercises  (not  necessarily 
enough  to  reach  mastery)  with  one  of  the  four  versions.  Students  completed  a  paper-and- 
pencil  posttest  and  an  on-line  posttest  in  a  standard  LISP  environment.  There  were  no 
significant  differences  among  the  immediate  feedback  tutor,  flag  tutor,  nor  demand 
feedback  tutor  in  either  of  these  posttest  environments.  Mean  scores  across  the  two 
posttest  environments  were  55%,  55%  and  58%  correct,  respectively.  All  three  groups 
were  reliably  superior  to  the  no-feedback  control  group  (43%  correct)  in  both  posttest 
environments. 

The  time  to  complete  the  tutor  exercises  is  displayed  in  Figure  5.  As  can  be  seen,  the 
conditions  are  ordered  in  terms  of  tutor  support:  immediate  feedback  is  best  and  it  is 
followed  by  error  flagging,  demand  feedback,  and  no-feedback.^^  Students  in  the  three 
feedback  conditions  necessarily  arrived  at  working  solutions  to  each  exercise.  While 
students  in  the  no-feedback  control  condition  attempted  every  exercise,  they  failed  to  solve 
25%.  Thus,  the  only  condition  to  show  inferior  posttest  performance  was  the  condition  in 

Subjects  are  taking  longer  in  the  later  lessons  because  the  exercises  are  longer  and  harder. 
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which  students  failed  to  solve  all  the  problems.  This  reinforces  our  conclusions 
(Anderson,  Conrad  &  Corbett,  1989)  that  posttest  performance  is  primarily  governed  by 
the  set  of  exercises  that  students  successfully  solve  and  understand. 

Figure  5  depicts  the  3-1  elapsed  time  ratio  cited  earlier  between  immediate  feedback  and  no¬ 
feedback.  This  ratio  underestimates  the  true  benefit  since  students  did  not  solve  all  the 
exercises  in  the  no-feedback  condition.  It  may  underestimate  the  benefit  over  a  typical 
classroom  in  that  students  in  all  conditions  at  least  had  access  to  declarative  instruction  and 
problems  carefully  designed  to  communicate  and  teach  the  target  productions.  A  typical 
classroom  may  well  be  less  organized. 

An  analysis  of  students’  performance  in  the  error  flagging  and  demand  feedback  conditions 
indicated  that  students  in  these  two  conditions  responded  fairly  passively  to  the  control  they 
were  offered.  When  errors  were  noted  immediately  in  the  error  flagging  condition, 
students  fixed  the  errors  immediately  80%  of  the  time.  On  the  other  hand,  when  the  tutor 
did  not  volunteer  error  information  in  the  demand-feedback  condition,  students  rarely 
interrupted  their  coding  activity  to  ask  for  help  or  evaluation.  In  the  demand-feedback 
conditions  students  did  not  ask  for  feedback  until  they  had  completed  a  preliminary  solution 
in  90%  of  the  exercises.  Finally,  we  asked  students  how  well  they  liked  the  tutor  and  the 
feedback  they  received.  Perhaps  surprisingly  there  were  no  reliable  differences  across 
groups,  although  there  was  an  interaction  with  the  curriculum;  The  harder  the  exercises 
became  the  more  students  appreciated  immediate  help. 


Insert  Figure  5  About  Here 


Feedback  Content 
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A  key  feature  of  a  tutor  is  what  it  says  to  the  student  during  a  problem-solving  episode. 
There  are  two  obvious  occasions  for  communicating  to  the  student  during  problem  solving. 
One  is  when  the  student  makes  some  error  and  the  tutor  can  comment  on  the  error.  The 
other  is  when  the  student  asks  for  help  or  appears  to  need  help  and  some  help  message  is 
given.  Obviously,  if  students  never  get  any  information  about  errors  in  their  solutions  they 
are  not  going  to  learn  to  avoid  them.  Similarly,  if  students  never  receive  any  help  of  any 
sort,  they  are  in  danger  of  becoming  permanently  stuck  on  some  problems.  However,  one 
can  construct  a  tutor  in  which  errors  are  just  flagg^  as  such  and  correct  solutions  pointed 
out  without  any  accompanying  explanation.  One  can  then  ask  what  the  potential  benefit  is 
of  the  accompanying  explanations.  We  have  performed  such  comparisons  twice — once 
with  the  LISP  tutor  and  once  with  the  geometry  tutor. 

In  the  LISP  tutor  (Anderson,  Conrad,  and  Corbett,  1989)  we  took  a  number  of  measures 
of  how  much  error  messages  helped.  One  was  how  well  students  performed  on  individual 
productions.  We  found  students  made  fewer  errors  per  production  if  they  were  receiving 
explanatory  feedback  (15%  versus  22%).  Also  when  they  made  an  error  and  received 
feedback  (an  explanation  not  just  that  they  were  wrong)  they  were  more  likely  to  correct 
their  error  on  the  first  attempt  (65%  corrections  versus  38%).  However,  when  we  looked 
for  long-term  learning  benefits  we  failed  to  find  any  significant  differences.  On  a  quiz 
immediately  after  the  tutor  exercises  students  with  explanatory  feedback  got  90%  correct 
and  those  without  got  91%  correct.  When  we  looked  at  their  performance  on  a  final  exam 
students  with  explanatory  feedback  got  76%  correct  while  those  without  got  80%  correct. 
Neither  difference  approached  statistical  significance. 

Thus,  the  impact  of  such  instruction  in  the  LISP  tutor  was  to  facilitate  the  students’ 
progress  through  the  material  but  did  not  have  any  permanent  achievement  consequences. 
This  is  not  an  insignificant  outcome  since  speed  of  learning  is  a  critical  dependent  measure. 


[Manuscript]  Tutor.Manuscript 


30 


August  12, 1994 


Reasonable  feedback  messages  also  appeared  to  have  a  positive  impact  on  the  perception  of 
the  tutor.  That  is,  students  had  numerous  derogatory  comments  to  make  about  the  no¬ 
feedback  tutor  even  if  they  eventually  learned  as  much  with  it. 

McKendree’s  (1990)  evaluation  of  the  feedback  messages  in  the  geometry  tutor  came  to 
somewhat  different  conclusions.  In  her  first  dissertation  study  (McKendree,  1986)  she 
found,  as  in  the  LISP  tutor,  that  these  messages  facilitated  progress  through  the  tutor  but 
did  not  have  any  permanent  benefit.  Frustrated  with  this  lack  of  permanent  benefit  she 
went  through  and  specifically  tuned  the  feedback  messages  to  be  particularly  cogent  to  the 
specific  problems.  In  her  second  evaluation  (reported  in  McKendree,  1990)  she  was  able 
to  show  a  benefit  in  terms  of  both  progress  through  the  tutor  and  final  achievement. 

McKendree  performed  a  theoretical  analysis  of  why  her  students  benefited  from  the 
carefully  crafted  feedback  messages.  She  was  able  to  show  that  students  had  failings  in 
their  underlying  declarative  knowledge  which  the  feedback  was  able  to  correct.  Some 
students  without  the  feedback  were  able  to  get  through  the  tutor  without  really  correcting 
their  misunderstandings  and  the  holes  in  their  knowledge.  The  tutor  had  no  mastery-based 
instructional  curriculum  and  students  just  had  to  get  through  a  fixed  number  of  problems. 
This  suggests  that  we  have  a  time-achievement  tradeoff  and  so  suggests  a  way  of 
reconciling  the  results  with  the  LISP  tutor.  In  the  LISP  tutor  experiment,  students  were 
given  enough  problems  to  reach  a  high  level  of  achievement  and  the  effect  of  feedback  was 
on  their  time  to  reach  that  level.  In  the  geometry  tutor,  students  went  through  the  problems 
in  relatively  constant  time  (and  a  short  amount  of  time  in  hours)  and  the  effect  showed  up  in 
their  achievement  levels. 

When  one  designs  help  messages  one  tends  to  wax  on  in  the  messages  to  the  student  both 
to  make  the  tutor  seem  intelligent  and  to  communicate  one’s  insights  into  the  problem. 
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Students  take  a  rather  different  attitude.  They  realize  it  is  just  a  computer  which  at  best  is 
just  a  tool  to  help  them  learn  and  they  have  no  interest  in  someone  else’s  prose.  They  want 
to  solve  the  problem  and  are  often  impatient  with  long  messages.  In  a  study  with  the 
algebra  tutor  Lewis  (1989)  compared  terse  messages  with  longer  messages  more  like 
natural  English  which  were  originally  used  with  the  tutor.  He  found  that  students  actually 
did  better  with  the  shorter  messages,  although  the  effect  was  not  statistically  significant. 

Stage  3:  Practical  Deployment 


Despite  the  successful  empirical  evaluations  of  our  work  on  tutoring,  our  tutors  had  not 
been  used  much.  The  programming  tutor  was  regularly  used  at  CMU  and  occasionally 
used  elsewhere.  A  scaled-down  version  of  the  geometry  tutor  was  ported  to  a  Mac  SE  and 
that  had  been  used  occasionally  in  classrooms  around  the  country.  However,  until  very 
recently  our  software  had  not  played  a  significant  and  permanent  role  in  the  instructional 
plans  of  any  organization  outside  of  our  own  CMU  classrooms.  While  the  machines  we 
were  working  with  were  large,  impractical  AI  behemoths,  this  lack  of  practical 
demonstration  was  not  a  salient  issue.  However,  now  our  tutors  can  be  deployed  on 
machines  which  are  conceivable  in  American  classrooms. 

When  we  examined  why  we  were  so  far  from  practical  tutors  it  became  apparent  that  we 
had  avoided  addressing  a  number  of  issues: 

(1)  There  was  never  any  attempt  on  our  part  to  address  the  curriculum  that 
educators  wanted  to  teach.  There  is  no  more  apparent  case  of  this  than  the  situation 
with  our  geometry  tutors  which  focused  on  teaching  proof  skills  while  mathematics 
educators  were  stressing  more  general  reasoning  and  problem-solving  skills. 
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(2)  There  was  no  thought  given  to  what  would  happen  to  students  after  they  passed 
through  our  tutors.  We  always  took  as  our  measure  of  success  performance  on 
some  final  test.  However,  to  be  educationally  valuable  our  tutors  have  to  fit  in  with 
some  larger  set  of  curriculum  objectives. 

(3)  The  systems  that  we  developed  were  inflexible  in  the  way  they  had  to  be  used 
and  gave  teachers  no  ability  to  tune  the  application  of  the  tutors  to  their  own  needs 
and  beliefs  about  instruction. 

(4)  The  was  little  understanding  of  how  to  support  the  deployment  of  these  tutors  in 
the  classroom.  Relevant  to  this  is  the  observation  that  we  have  not  had  a  positive 
classroom  evaluation  that  did  not  involve  teachers  who  had  spent  extensive  time 
involved  in  the  project. 

Addressing  these  problems  has  caused  us  to  go  beyond  our  original  goals  of  showing  that 
our  cognitive  models  can  lead  to  successful  learning.  We  have  now  begun  to  address  the 
issues  of  how  to  develop  tutors  which  will  implement  an  externally  specified  curriculum, 
which  can  be  deployed  in  a  wide  range  of  classrooms,  and  which  leave  students  with  a 
competence  that  makes  a  demonstrable  contribution  to  their  activities  outside  of  the  specific 
domains  taught  by  the  tutor.  We  have  undertaken  two  major  endeavors  in  response  to 
these  new  agenda.  One  has  been  to  create  a  development  system  for  creating  such 
cognitive  tutors  and  to  begin  to  work  on  a  development  discipline  for  their  creation.  The 
second  has  been  to  strike  up  a  close  relationship  with  the  Pittsburgh  Public  Schools  in 

^^This,  of  course,  raises  interesting  issues  about  evaluation.  Since  we  have  not  gotten  positive  results 
simply  by  putting  computers  in  the  classroom,  this  indicates  our  positive  results  with  project  teachers  is 
not  simply  a  Hawthorne  effect.  The  control  classes  (no  tutor  classes)  of  our  project  teachers  do  at  least  as 
well  as  the  control  and  tutored  classes  of  non-project  teachers.  Thus,  there  is  something  special  about  the 
combination  of  the  tutor  and  the  teachers’  preparation  to  use  them.  We  are  working  with  the  Pittsburgh 
Public  Schools  to  try  to  develop  an  appropriate  teacher  training  program. 
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which  they  serve  as  our  clients  and  we  try  to  build  instructional  software  which  can  be  used 
in  their  classrooms. 

We  have  three  permanent  classrooms  of  Mac  IIs  and  Quadras  in  three  local  city  high  school 
and  additional  classrooms  are  currently  being  planned  in  other  city  high  schools.  We  are 
working  towards  supporting  the  city’s  mathematics  and  computer  science  curriculum.  The 
emphasis  in  the  mathematics  curriculum  is  to  implement  the  NGTM  standards  (National 
Council  of  Teachers  of  Mathematics,  1988)  in  an  urban  setting.  A  major  issue  here  is  to 
teach  a  curriculum  which  will  empower  students  to  participate  in  modem  society.  This  is  a 
particularly  significant  issue  in  a  large  urban  school  system  with  many  students  coming 
from  economically  disadvantaged  families. 

We  have  created  a  succession  of  development  systems  (Anderson,  Corbett,  Fincham, 
Hoffman,  &  Pelletier,  1992;  Anderson  &  Pelletier,  1991).  We  have  implemented  in  them 
tutors  for  three  programming  languages  (LISP,  Pascal,  and  Prolog),  for  elementary 
arithmetic,  and  for  Algebra  I.  We  have  plans  to  build  a  tutor  for  geometry  in  this  system 
which  will  extend  the  geometry  tutor  of  Koedinger  and  Anderson  (1993)  to  combine 
construction,  exploration,  conjecture,  and  proof.  A  major  goal  in  this  tutor  development 
system  is  usability.  This  means  both  facilitating  the  teachers’  use  and  modification  of  the 
systems  and  enabling  as  many  people  as  possible  to  develop  software  in  the  system. 

The  actual  process  of  developing  a  tutor  has  five  identifiable  stages:  interface  construction, 
curriculum  specification,  cognitive  modeling,  design  of  instruction,  and  classroom 
deployment.  We  will  discuss  each  of  these. 

Interface  Construction  and  the  Issue  of  Transfer 
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The  first  step  in  developing  a  tutor  is  to  define  the  world  in  which  the  student’s  problem 
solving  is  going  to  take  place.  This  will  be  the  interface  between  the  student  and  the 
computer.  Placing  interface  design  ahead  of  production-system  design  represents  a  major 
restructuring  of  our  approach  to  tutor  construction.  In  our  early  efforts  we  started  with  an 
abstract  production-rule  model  of  the  cognitive  skill.  Interface  design  was  a  secondary 
though  nontrivial  task  in  which  we  considered  optimal  ways  to  depict  productions  and 
goals  on  the  screen  and  appropriate  sets  of  student  actions.  Our  current  view  is  that  the 
skill  we  are  teaching  is  problem  solving  in  a  particular  interface.  Therefore,  the  interface 
must  be  designed  before  we  can  identify  the  production  rules.  The  significant  issue  that  we 
must  face  in  interface  design  is  transfer.  The  interface  students  learn  will  have  a  large 
impact  on  where  their  skills  will  transfer. 

In  designing  an  interface  one  must  keep  in  mind  the  domain  to  which  the  skill  is  supposed 
to  transfer.  Often  that  domain  is  still  paper  and  pencil.  For  instance,  most  college 
mathematics  departments  still  expect  incoming  students  to  be  proficient  at  paper  and  pencil 
algebraic  manipulations.  However,  there  is  an  increasing  tendency  for  the  target  skill  to 
involve  use  of  computer  software.  Thus,  part  of  the  competence  we  are  trying  to  teach  in 
the  current  algebra  tutor  is  how  spreadsheets,  symbol  manipulation  packages,  and  graphing 
routines.  Having  identified  the  target  skills,  one  must  design  the  interface  to  enable  transfer 
to  these  target  skills.  The  issue  of  transfer  is  one  of  psychology  and  here  it  is  worth 
distinguishing  three  levels  at  which  transfer  can  occur: 

(1)  Identical  Productions:  The  production  rules  for  tutor  exercises  may  be  identical  to 
those  for  the  target  domain.  In  this  case  we  would  expect  total  transfer.  This  might  be  the 
case,  for  instance,  if  the  target  domain  were  a  computer  system  and  our  tutor  taught  how  to 
use  it.  In  other  cases  the  tutor  productions  will  only  overlap  with  those  for  the  target 
domain.  In  this  case.  Poison  and  Kieras  (1985)  and  Singley  and  Anderson  (1989)  have 
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shown  that  the  amount  of  transfer  will  be  a  function  of  the  degree  of  production-rule 
overlap.  For  instance,  our  programming  tutors  focus  on  code  selection  and  not  syntax  with 
a  structure  editor  providing  the  syntax.  Thus,  students  graduating  from  our  tutors  are 
successful  coders  but  have  some  difficulty  with  syntax  when  tested  without  the  structure 
editor  since  they  have  not  acquired  the  necessary  syntax  productions.  While  students  have 
to  pick  up  syntax  after  mastering  the  other  aspects  of  the  language,  this  does  not  appear  to 
be  a  major  learning  hurdle  (Goldenson,  1989a,  1989b). 

< 

(2)  Translating  actions.  Even  if  the  tutor  productions  and  target  domain  productions  are 
different  it  may  be  apparent  to  the  student  how  to  convert  the  actions  of  the  learned 
productions  into  appropriate  behavior  in  the  target  domain.  For  instance,  we  find  relatively 
high  transfer  from  a  tutor  which  has  students  select  programming  constructs  by  menu  to  a 
test  environment  that  requires  the  student  to  recall  these  constructs.  This  is  because  it  is 
pretty  apparent  to  most  students  that  if  they  have  been  selecting  "writeln"  from  a  menu  in 
the  tutor,  they  should  type  that  in  when  writing  code  into  a  standard  file.  In  the  1986-1987 
geometry  tutor  study  we  found  high  transfer  of  students  from  doing  proofs  in  the  proof 
graph  formalism  to  the  two-column  proof  formalism.  This  is  a  less  obvious  translation  but 
the  teacher  had  gone  over  with  them  how  the  proof  graph  related  to  the  traditional  two- 
column  proof. 

(3)  Declarative  Transfer.  Even  if  actions  in  one  domain  cannot  be  directly  translated  to 
actions  in  another,  there  can  be  declarative  transfer  of  the  underlying  competence.  Thus, 
we  find  that  students  who  practice  coding  with  the  LISP  tutor  do  fairly  well  in  evaluating 
LISP  expressions  (although  there  is  hardly  total  transfer  between  these  activities — 
Anderson,  Conrad,  &  Corbett,  1989).  This  is  because  both  skills  rest  on  the  same 
declarative  understanding  of  LISP  and  students  must  get  their  declarative  representation 
right  before  they  can  acquire  successful  productions  for  coding. 
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There  has  been  a  great  deal  of  interest  of  late  in  occasions  where  students  fail  to  transfer 
across  domains  (Lave  &  Wenger,  1990).  In  our  perspective  there  is  nothing  mysterious 
about  when  transfer  will  occur  and  when  it  will  not.  Transfer  requires  students  to  have 
learned  production  rules  in  the  training  domain  which  will  solve  problems  in  the  target 
domain.  In  the  cases  that  the  actions  of  the  rules  in  the  training  domain  are  different  than 
what  is  needed  in  the  target  domain  (e.g.,  menu  selection  versus  writing)  it  must  be 
apparent  to  the  student  how  to  map  one  action  into  another.  In  some  writings  one  gets  the 
impression  that  lack  of  transfer  is  the  rule.  However,  our  research  on  tutoring  shows  that 
transfer  as  predicted  by  production  rule  overlap  is  quite  common.  Every  time  we  report  a 
positive  result  in  paper-and-pencil  test  outside  of  one  of  our  tutors  it  is  a  case  of  transfer. 

There  are  two  approaches  to  interface  construction  in  our  tutor  development  package.  One 
is  to  build  the  interface  ourselves.  We  have  a  set  of  primitives  for  facilitating  the 
development  of  such  interfaces  and  relating  these  interfaces  to  our  production  model.  The 
other  possibility  is  to  take  the  existing  piece  of  software  and  add  hooks  to  it  so  that  it  is 
linked  into  our  tutoring  system.  In  either  case,  the  following  are  the  requirements  for  the 
interaction  with  the  interface: 

(1)  Actions  taken  to  the  interface  must  be  passed  through  the  tutor.  The  tutor  needs 
to  know  what  actions  students  have  taken  so  it  can  follow  students  along  the 
solution  path  they  are  pursuing  and  provide  appropriate  guidance. 

(2)  The  tutor  must  be  informed  about  the  consequences  of  any  interface  action  for 
the  state  of  the  interface.  Basically,  the  cognitive  model  needs  to  maintain  in  its 
working  memory  a  representation  of  the  interface  that  the  students  see. 
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(3)  The  tutor  must  be  able  to  perform  interface  actions  itself. 

Subject  to  these  constraints,  there  is  unlimited  flexibility  in  the  kind  of  interfaces  we  can 
tutor.  We  are  particularly  attracted  to  taking  generally  available  pieces  of  software  and 
tutoring  students  on  problem  solving  within  that  software.  When  students  leave  our  tutors, 
they  will  still  have  a  useful  problem-solving  environment  (and  the  transfer  problem  is 
minimized).  So,  for  instance,  we  are  doing  some  work  on  tutoring  students  on  algebraic 
problem-solving  using  Excel  and  geometric  conjecture  using  Sketchpad  (Key  Curriculum 
Press,  1991).  In  using  such  an  interface,  we  can  take  advantage  of  the  many  years  of 
effort  that  went  into  making  it  flexible,  reliable,  and  efficient. 

Curriculum  Construction 

Once  having  specified  the  environment  in  which  the  students  are  going  to  display  their 
problem-solving  competence  we  come  to  the  issue  of  exactly  what  competence  they  are  to 
display  in  that  domain.  Specifying  the  competence  comes  down  to  identifying  the  type  of 
problems  students  are  expected  to  solve  in  the  domain  and  the  constraints  on  their  problem 
solving.  Here  our  attitude  is  to  take  our  specification  from  the  educational  community  that 
is  our  client.  For  instance,  in  working  with  the  Pittsburgh  Public  Schools  mathematics 
educators  we  take  their  input  as  to  what  problems  they  want  students  to  solve.  Fortunately, 
we  have  chosen  a  client  who  is  at  the  lead  in  trying  to  achieve  NCTM  standards  in  an  urban 
environment.  Therefore,  we  are  confident  our  tutors  will  have  broader  applicability. 

Our  clients  also  have  strong  input  on  the  computer  problem  solving  interface.  So,  for 
instance,  in  the  case  of  geometry  it  was  the  Pittsburgh  Public  Schools  that  chose  we  should 
work  with  the  Geometer’s  Sketchpad  (Key  Curriculum  Press,  1991).  However,  the  exact 
form  of  that  problem-solving  interface  was  already  determined  by  other  forces.  They  made 
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their  choice  of  it  much  as  they  would  make  a  textbook  adoption.  They  cannot  specify  the 
microstructure  of  the  interface  any  more  than  they  can  specify  the  exact  content  of  a 
textbook.  However,  like  a  textbook,  they  can  specify  how  the  interface  is  to  be  used. 

Within  a  particular  interface  our  clients  have  their  conception  of  the  problems  they  want 
students  to  solve  and  the  constraints  under  which  they  want  students  to  solve  the  problems. 
The  issue  of  constraints  is  key  here.  For  instance,  a  client  may  want  to  use  a  piece  of 
software  which  has  an  algebraic  symbol  manipulation  package  but  may  want  to  prevent  the 
student  from  using  that  package  in  certain  parts  of  the  curriculum  to  exercise  that  student's 
own  symbol  manipulation  abilities.  This  amounts  to  the  sorts  of  instructions  a  teacher 
might  give  the  student  about  how  to  solve  a  problem. 

While  our  software  is  initially  developed  in  response  to  the  needs  of  one  client,  other  clients 
may  want  to  use  it  with  somewhat  other  goals  in  mind.  This  means  we  must  allow  them  to 
select  the  constraints  under  which  the  problems  are  solved  and  the  problems  which  are 
actually  solved.  It  is  useful  to  have  a  facility  so  teachers  can  enter  new  problems.  Also 
educators  need  to  have  access  to  some  of  the  tutoring  options.  Earlier  we  described  the 
variety  of  tutoring  modes  such  as  immediate  feedback,  flag  tutoring,  and  demand  tutoring. 
Our  current  tutor  development  kit  permits  all  of  these  different  tutoring  modalities  and  the 
educator  is  able  to  choose  among  them. 

Our  tutors  will  track  the  students’  performance  on  various  production  rules  (knowledge 
tracing)  and  promote  students  through  the  curriculum  as  they  achieve  mastery  on  these 
rules  (mastery  learning).  Again  teachers  need  to  have  the  ability  to  turn  mastery  learning  or 
knowledge  tracing  off  or  to  override  these  facilities  at  various  points. 

Production  System  Modeling 
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Specifying  a  problem-solving  environment  and  a  set  of  constrained  problems  to  be  solved 
in  that  environment  amounts  to  a  behavioral  specification  of  the  target  competence.  Our 
major  task  as  cognitive  modelers  is  to  figure  out  what  that  competence  means  in  terms  of  a 
set  of  underlying  production  rules  that  are  capable  of  generating  that  behavioral  competence 
in  a  cognitively  plausible  way.  This  is  the  task  of  constructing  a  student  model.  Such  a 
student  model  is  runnable  in  the  sense  that  it  can  send  actions  to  the  interface  which  would 
constitute  a  correct  solution  to  the  problem.  In  most  cases  there  is  more  than  one  possible 
solution  path  and  the  ideal  student  model  must  be  capable  of  nondeterministically 
generating  all  the  solutions. 

The  production  mles  respond  to  information  in  working  memory.  Typically  information  in 
working  memory  will  be  of  two  kinds:  information  about  what  the  current  state  of  the 
problem  is  and  a  representation  of  what  the  goal  is.  In  some  cases,  like  the  statement  to  be 
proved  in  a  geometry  problem,  the  goal  representation  is  straightforward.  However,  when 
the  goal  is  stated  in  natural  language,  as  in  the  case  of  a  programming  problem,  it  can  be 
quite  problematical  how  to  represent  it.  We  do  not  want  to  represent  or  model  the  natural 
language  processing  that  is  involved  in  understanding  the  statement:  this  would  just  be  too 
much  to  feasibly  model.  Therefore,  we  represent,  in  some  form,  what  we  believe  to  be  the 
product  of  the  natural  language  understanding.  The  problem  is  that  it  is  hard  to  resist 
building  into  that  representation  part  of  the  solution  to  the  problem.  Thus,  we  may  not 
adequately  represent  the  problem  that  the  student  faces  or  the  skills  that  need  to  be  learned. 

Once  the  production  rules  for  solving  the  problem  are  specified,  one  needs  to  be  able  to 
match  up  the  student’s  behavior  with  these  rules.  This  requires  augmenting  the  rules  with 
tests  that  match  against  the  student’s  behavior  so  that  the  tutor  can  determine  which  rules 
have  fired  in  the  student’s  head.  In  the  case  of  ambiguities,  disambiguation  menus  must  be 
generated  from  templates  stored  with  the  rules. 


[Manuscript]  Tutor.Manuscript 


40 


August  12, 1994 


The  product  of  this  effort  might  be  viewed  as  an  “instructionless”  tutor,  like  the  non¬ 
explanation  tutors  described  in  the  evaluation  section,  that  can  be  deployed  in  a  number  of 
tutor  modalities  (e.g.,  immediate  feedback,  flag,  demand,  no  tutor).  It  can  identify  for  a 
student  where  mistakes  are  and  indicate  correct  courses  of  action.  However,  it  can  say 
nothing  about  why  one  action  is  wrong  and  another  is  correct.  That  awaits  the  construction 
of  declarative  instruction  in  the  next  stage. 

Declarative  Instruction 

As  we  noted  in  the  beginning  of  this  paper,  part  of  the  domain  competence  comes  from 
declarative  instruction  given  outside  of  the  tutor.  Successful  operation  of  the  tutor  assumes 
successful  acquisition  of  this  declarative  knowledge.  Some  of  this  declarative  instruction 
concerns  general  concepts  (e.g.,  what  an  alternate  interior  angle  is)  and  other  communicates 
information  that  will  serve  as  the  declarative  basis  from  which  production  rules  are 
compiled  (e.g.,  one  way  to  prove  lines  parallel  is  to  show  that  their  alternate  interior  angles 
are  congruent).  This  declarative  instruction  may  come  in  class  lectures  or  in  text  material. 
We  have  often  provided  the  student  with  specially  written  material  to  accompany  the  tutor. 
Recently  we  have  had  success  using  hypertext  facility  that  can  be  accessed  in  parallel  with 
the  tutor.  The  content  of  this  instruction  is  informed  by  the  production  rules  that  are  to  be 
learned  in  the  upcoming  section.  The  instruction  tries  to  provide  examples  that  illustrate  the 
rules  and  annotate  those  examples  with  comments  that  will  highlight  the  significant  aspects 
of  the  rules.  A  general  principle  in  our  approach  to  instruction  is  to  be  minimalist  and  not 
say  more  than  is  needed.  This  sensible  approach  tends  not  to  be  followed  in  most 
textbooks  but  is  well  supported  by  research  (Reder  &  Anderson,  1980;  Reder,  Chamey  & 
Morgan,  1986). 
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While  this  tutor-external  instruction  is  important,  of  more  concern  to  the  tutor  development 
system  is  the  declarative  instruction  delivered  from  within  the  tutor.  This  is  of  two  kinds: 

(1)  Error  Messages.  When  the  student  makes  an  error  one  can  present  a 
message  that  attempts  to  tell  the  student  something  useful  about  that  error.  This 
requires  writing  buggy  productions  and  attaching  instruction  from  these 
productions.  In  general  we  do  not  attempt  to  provide  any  deep  diagnosis  of  the 
cognitive  origins  of  the  error.  Rather  we  simplj'  try  to  explain  why  it  is  an  error. 

(2)  Help  Messages.  At  various  points  in  time  the  student  can  request  help  or  be 
judged  in  need  of  help  and  a  help  message  can  be  generated.  These  are  generated 
from  templates  associated  with  the  correct  productions  which  would  have  fired  at 
that  point. 

There  are  a  number  of  issues  about  how  to  present  the  help  messages.  We  have  striven  for 
a  system  which  tries  to  make  these  messages  as  short  and  to  the  point  as  possible  even  if 
the  messages  sound  nonhuman.  Another  issue  concerns  modality  of  delivery.  While  we 
have  always  given  these  in  the  visual  modality  we  would  like  to  add  an  auditory  modality 
for  instruction.  Instructing  in  the  visual  modality  interferes  with  processing  of  the  problem 
which  is  also  in  the  visual  modality.  Instruction  in  the  auditory  modality  would  increase 
the  premium  placed  on  short  messages  (Ladday,  Levine,  &  Suppes,  1981). 

Another  issue  concerns  how  to  deal  with  students  who  overuse  hints  and  as  a  consequence 
learn  little  (Shute,  Woltz,  &  Regian,  1989).  We  find  that  linking  knowledge  tracing  to  help 
seeking  is  an  effective  way  of  dealing  with  hint  abusers.  If  their  progress  through  the  tutor 
depends  on  eventually  solving  the  problems  without  help,  students  will  not  seek  help 
unless  they  really  need  it. 
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Two  questions  remain  unresolved  with  respect  to  help  messages.  These  are  whether  to 
volunteer  help  to  students  who  appear  to  need  it  and  whether  to  present  students  everything 
in  a  single  message  or  whether  to  provide  a  sequence  of  successively  more  explicit 
messages.  The  current  hinting  discipline  in  the  tutor  was  designed  to  let  students  do  as 
much  as  possible  for  themselves.  This  was  motivated  by  research  in  psychology  showing 
that  subjects  have  better  memory  for  material  to  the  degree  they  participate  in  the  generation 
of  that  material  (Anderson,  1990,  Ch.  7).  Thus,  our  current  tutors  never  volunteer  help 
and  only  provide  help  upon  request.^^  Also  our  current  tutors  use  a  scheme  of  successive 
hinting  in  which  the  initial  help  only  gives  a  vague  characterization  and  subsequent  help 
messages  become  more  specific  until  the  student  is  told  exactly  what  to  do.  However, 
these  may  not  be  the  best  choices.  Some  students  stubbornly  refuse  to  seek  help  even 
when  they  need  it.  Also,  with  respect  to  the  policy  of  successive  hinting,  students  are  often 
annoyed  with  the  vague  initial  messages  and  decide  there  is  no  point  in  using  the  help 
facility  at  all.  The  deployment  of  the  tutor  in  courses  may  also  influence  the  content  of  help 
messages.  When  students  are  using  the  tutor  in  a  self-paced  course  or  otherwise  on  their 
own,  it  is  essential  to  tell  students  exactly  what  to  do  if  necessary  to  allow  them  to  proceed. 
In  a  classroom,  it  may  be  preferable  for  the  students  to  interact  with  the  teacher  if  they  do 
not  understand  an  explanation,  so  help  messages  may  stop  short  of  describing  the  specific 
action. 


Deployment  In  the  Classroom 

It  is  interesting  to  consider  how  these  tutors  are  actually  deployed.  At  Carnegie  Mellon  the 
programming  tutors  are  used  in  self-paced  leam-on-your-own  environments.  There  are 
class  sessions  associated  with  the  tutor  but  they  are  largely  used  for  administrative 

can  be  viewed  as  embedding  opportunities  within  the  tutors  for  students  to  discover  the  concepts  for 
themselves. 
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purposes.  Students  learn  successfully  from  reading  the  prepared  text  and  doing  problems. 
They  can  go  to  the  teacher  if  they  have  special  problems  that  the  tutor  cannot  handle  and 
occasionally  they  do.  They  also  do  larger  (but  still  modest)  projects  outside  of  the  tutor 
counting  on  competencies  they  have  built  up  with  the  tutor. 

While  this  relatively  teacherless  and  isolated  model  has  worked  reasonably  well  for  our 
programming  course,  this  model  is  not  viable  for  the  general  deployment  of  tutors  in  public 
high  schools.  The  Carnegie  Mellon  model  depends  on  the  following  facts:  (1)  we  have 
designed  our  programming  tutors  to  deliver  just  the  material  we  want  to  teach,  (2)  we  have 
total  control  over  our  classroom,  (3)  we  are  working  with  relatively  mature  students  who 
come  in  on  their  own  time  and  are  generally  familiar  with  computers,  and  (4)  we  expect 
students  in  introductory  programming  courses  to  display  their  skills  isolated  from  other 
students.  None  of  these  assumptions  are  satisfied  in  general.  The  particular  nature  of  our 
programming  class  has  made  it  an  excellent  laboratory  for  study  of  skill  acquisition  and 
issues  of  tutoring  but  it  has  made  it  non-representative  for  how  these  tutors  will  be 
deployed  in  other  educational  environments. 

The  typical  educational  environment  in  an  American  high-school  mathematics  class 
contrasts  with  this  situation  in  several  ways.  First,  there  are  large  curriculum  variations 
across  states  and  school  districts  and  smaller  variations  across  almost  every  teacher  within  a 
district.  As  a  consequence,  any  mathematics  tutor  will  likely  be  delivering  only  part  of  the 
curriculum  that  a  particular  teacher  delivers,  so  the  tutor  will  be  integrated  with  other 
classroom  activities.  Second,  students  are  not  mature  enough  to  simply  show  up  at  a 
teacherless  class  and  learn.  They  will  get  stuck  too  often  in  ways  that  the  tutor  cannot 
remediate  and  discipline  problems  would  develop.  Finally,  the  National  Council  of 
Teachers  of  Mathematics  (NCTM)  has  placed  a  major  emphasis  on  teaching  group  problem 
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solving  in  their  standards  (1989, 1991),  so  group  activities  will  come  to  replace  individual 
skill  performance  to  varying  degrees  across  classrooms. 

Our  tutors  have  had  some  successes  in  such  classrooms.  Currently,  30  classes  in  algebra, 
geometry,  and  Pascal  programming  are  using  a  classroom  of  Mac  n  based  tutors  at  a  local 
Pittsburgh  high  school.  Students  in  these  courses  alternate  between  working  tutor 
exercises  and  other  classroom  activities,  so  tutor  use  has  the  flavor  of  going  to  a  computer 
laboratory  within  the  context  of  a  conventional  course,  although  students  may  spend  as 
much  as  2/3  of  their  class  time  in  the  laboratory. 

We  are  struck  by  the  way  students  interact  with  these  tutors  and  the  consequences  for  class 
organization.  Much  of  this  was  described  in  the  early  reports  of  Schofield  and  Evan- 
Rhodes  (1989)  and  Wertheimer  (1990)  but  the  effects  are  even  more  striking  in  the  current 
classrooms  reflecting  the  larger  class  sizes  and  the  social  changes  since  the  original 
classroom  studies.  When  students  are  in  the  laboratory  they  are  working  one-on-one  on 
machines  but  that  hardly  means  they  are  working  in  isolation.  There  is  a  constant  banter  of 
conversation  going  on  in  the  classroom  where  different  students  compare  their  progress 
and  help  one  another.  Peer  instruction  is  particularly  key  in  cases  where  students  have  to 
adapt  to  a  new  interface  feature.  Information  about  how  to  use  that  features  propagates 
through  the  classroom  much  like  information  about  how  to  use  a  new  trick  in  a  Nintendo 
game.  We  have  come  to  realize  that  our  tutors  would  be  less  successful  if  such  peer 
assistance  were  not  available.  Peer  helping  may  also  be  a  good  way  for  the  helper  to  come 
to  a  deeper  understanding  of  the  material.  An  effective  teacher  is  quite  active  in  such  a 
classroom,  circulating  about  the  class  and  providing  help  to  students  who  cannot  get  the 
help  they  need  from  either  the  tutor  or  their  peers.  The  tutor  in  effect  becomes  ^  assistant 
that  can  deal  with  the  more  routine  learning  problems  allowing  the  teacher  to  focus  on  the 
more  difficult.  By  means  of  its  knowledge  tracing  algorithm  it  also  is  able  to  monitor 
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separately  the  progress  of  individual  students,  providing  a  bookkeeping  facility  the  teacher 
would  never  be  able  to  accomplish.  Teachers  seem  to  require  some  time  in  the  classroom 
before  they  appreciate  the  “tutor  as  teaching  assistant”  model  and  can  use  it  to  its  maximum 
potential. 

Students’  own  attitudes  to  the  tutor  classrooms  are  quite  positive  to  the  point  of  creating 
minor  discipline  problems.  Students  skip  other  classes  to  do  extra  work  on  the  tutor, 
refuse  to  leave  the  class  when  the  period  is  over,  and  come  in  early.  However,  in  net, 
discipline  problems  and  class  management  problems  are  much  less  in  a  tutored  classroom. 
There  is  a  sense  of  satisfaction  in  progress  and  achievement.  Visitors  to  the  classroom  are 
struck  by  the  fact  that  students  are  absorbed  in  the  learning  tasks  through  the  whole  period. 
Teachers  particularly  remark  on  the  success  with  minority  students  who  are  frequently 
alienated  in  conventional  classrooms.  It  is  our  belief  that  students  receive  our  tutors 
favorably  to  the  degree  that  our  tutors  achieve  their  fundamental  claim — to  embody  an 
accurate  cognitive  model  of  the  details  of  the  problem  solving.  If  so,  the  interactions  with 
the  tutor  are  largely  congruent  with  the  student’s  thinking  and  when  the  interactions  are  not 
congruent  they  point  the  student  in  the  right  direction.  While  students  do  not  consciously 
assess  the  system  in  terms  of  its  cognitive  fidelity,  they  are  very  aware  of  the  resulting 
smoothness  of  their  trajectory  through  learning  curriculum.  A  sense  of  growing 
competence  in  a  challenging  problem  domain  is  something  that  most  people  respond  to 
positively. 

We  have  been  impressed  by  the  relative  ease  of  management  in  our  tutored  classrooms.  If 

one  provides  teachers  with  a  couple  of  weeks  of  familiarization  with  the  basic  software, 

they  seem  to  adapt  comfortably  to  the  tutored  classroom.  This  contrasts  sharply  with 

many  efforts  at  classroom  reform  which  teachers  report  to  be  quite  exhausting  in  terms  of 

^^However,  it  may  take  much  longer  before  they  really  use  the  tutor  to  its  full  effectiveness.  There  is 
some  evidence  that  achievement  gains  are  higher  the  second  year  teachers  work  with  the  tutor. 
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the  demands  being  placed  on  them.  Our  classiooms  are  relatively  easy  because  the  tutor  is 
doing  all  of  the  bookkeeping  and  low-level  instruction  associated  with  the  classroom  and 
the  teacher  can  focus  on  giving  one-on-one  tutoring  to  students  for  whom  the  computer  is 
not  adequate.  They  generally  find  this  a  satisfying  role  and  one  that  enhances  their 
classroom  esteem  as  subject-matter  experts.  This  works  because  our  tutors  are  engaging 
and  so  other  students  are  on  task  when  a  teacher  is  giving  one  student  individual 
attention. 


Reflections 


We  have  come  a  long  way  from  our  original  goal  of  putting  ACT*  to  a  tough  test.  There 
certainly  has  been  a  harvest  of  empirical  data  which  has  played  a  major  role  in  leading  to  the 
new  ACT-R  theory  (see  Anderson,  1993).  We  have  totally  abandoned  our  original 
conception  of  tutoring  as  human  emulation.  We  now  conceive  of  a  tutor  as  a  learning 
environment  in  which  helpful  information  can  be  provided  and  useful  problems  can  be 
selected.  We  are  able  to  take  actions  that  facilitate  learning  because  we  possess  a  cognitive 
model  of  where  the  student  is  in  that  task. 

While  we  have  not  abandoned  our  goals  of  contributing  to  the  understanding  of  human 
cognition,  we  have  been  drawn  by  application  to  issues  that  are  far  afield.  Particularly  with 
the  tutor  development  kit  and  large  scale  applications,  we  find  ourselves  addressing  issues 
of  software  engineering.  Although  we  have  tried  to  place  content  decisions  in  the  hands  of 
our  “clients,”  we  inevitably  are  drawn  into  issues  about  the  content  and  purpose  of  high 


l^This  only  works  if  our  tutors  are  bug-free,  easy  to  use,  and  allow  for  error  recovery  from  things  like 
machine  crashes.  Typical  teachers  become  very  frustrated  when  their  interactions  with  the  student  must 
focus  on  the  computer  or  the  tutor,  rather  than  the  subject  domain  being  taught. 
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school  education.  Finally,  there  are  important  social  phenomena  in  our  classrooms,  critical 
to  the  success  of  our  tutors,  which  we  need  to  understand. 

Many  such  issues  may  ultimately  prove  more  important  to  the  success  of  our  tutors  than 
their  cognitive  fidelity.  However,  we  are  impressed  that  ten  years  later  the  general 
cognitive  modeling  approach  still  seems  viable  and  important  in  our  new  applications. 


The  Curriculum  Issue  ^ 

Rather  than  conclude  this  paper  on  this  self-congratulatory  note,  we  have  been  persuaded  to 
address  some  of  the  senses  of  unease  that  some  people  have  with  our  efforts.  There  are 
probably  many  dimensions  to  their  unease  but  the  reviewers  of  our  paper  have  gotten  us  to 
focus  on  issues  surrounding  the  nature  of  the  curriculum  our  tutors  deliver.  We  have  taken 
the  liberty  of  quoting  three  of  their  remarks  and  then  commenting. 

“I  would  like  to  believe  that  a  decade  of  research  in  this  area  has  given  the  authors  a 
solid  perspective  on  what  to  teach,  how  to  teach  it,  and  how  to  assess  the  effect  of 
that  instruction.  Instead  of  providing  guidance  to  educators  in  these  areas,  the 
authors  seem  willing  to  abrogate  this  responsibility  and  to  settle  into  the  role  of 
technologists,  teaching  what  the  current  curriculum  dictates  regardless  of  the 
appropriateness.  ” 

We  have  stated  strong  opinions  about  how  to  assess  the  effect  (time  to  reach  a  prescribed 
level  of  achievement),  but  otherwise  the  reviewer  is  correct  in  the  assertion. Our  ten 
years  of  experience  have  in  fact  given  us  no  basis  for  offering  advice  on  some  of  the  key 

*®Although  assenting  to  the  reviewer’s  comment  requires  interpreting  “current  curriculum”  to  Pittsburgh’s 
algebra  and  geometry  curriculum  that  follows  and  in  some  ways  goes  beyond  the  NCTM  standards. 
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issues  facing  the  City  of  Pittsburgh  in  deciding  about  its  mathematics  curriculum.  Some  of 
these  issues  have  been  very  responsibly  addressed  by  things  like  the  NCTM  standards  and 
the  city  has  adopted  these.  However,  frankly,  these  standards  leave  open  some  important 
issues  facing  mathematics  education  in  an  urban  setting.  An  example  of  such  an  issue  is  to 
what  degree  should  the  curriculum  be  teaching  students  “employable”  skills  (like  using 
spreadsheets)  versus  to  what  degree  should  the  curriculum  be  preparing  students  for 
college  mathematics  (e.g.,  being  prepared  to  understand  the  proof  of  the  fundamental 
theorem  of  calculus).  As  citizens  we  may  hold  opinions  on  this  issue,  but  nothing  in  our 
tutoring  work  informs  us  on  the  relative  value  of  different  educational  goals. 

“Underlying  the  project  of  tutor  construction  is  the  conviction  that  the  subject  matter 
can  be  represented  as  a  production  set.  I  cannot  repress  the  suspicion  that  the 
particular  choices  of  subject  matter  made  by  the  tutor  authors  reflect,  whether 
consciously  or  not,  this  conviction  and  thus,  that  the  material  which  lends  itself  less 
well  to  the  theoretical  framework  is  left  unconsidered.” 

As  the  first  remark,  this  reviewer  is  substantially  correct  in  the  assertion  although  we  again 
see  somewhat  different  implications  in  it.  Our  tutors  have  their  largest  potential  impact 
when  there  is  a  substantial  production-rule  component.  Thus,  we  have  stayed  away  from 
teaching  simple  algorithmic  skills  like  addition  and  focused  on  high  school  mathematics 
because  it  seemed  that  this  is  where  the  largest  impact  would  be.  Some  “advanced” 
domains  are  largely  declarative.  So,  we  have  considered  a  tutor  for  cognitive  psychology 
but  have  concluded  that  the  body  of  knowledge  typically  taught  in  an  undergraduate  course 
is  largely  factual  and  very  limited  inferential  chains  are  required.  (We  think,  by  the  way,  in 
our  role  as  citizens  of  the  field  and  as  an  author  of  a  cognitive  psychology  text,  that  this 
reflects  a  serious  indictment  of  instructional  goals  in  the  field).  It  is  also  the  case  that 
developing  cognitive  modeling  is  an  expensive  enterprise  (we  estimate  10  or  more  hours 
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per  production  rule)  and  it  may  not  be  economical  or  feasible  to  model  all  the  competence 
involved  within  more  advanced  areas  of  mathematics. 

However,  while  there  are  these  practical  limitations,  we  do  not  believe  that  there  are  any 
fundamental  limitations  of  the  approach.  For  instance,  we  are  currently  working  on 
developing  a  tutor  for  exploration  and  discovery  in  the  context  of  geometry  since  these  are 
typically  thought  to  be  skills  outside  the  domain  of  our  tutor.  While  we  do  not  have  any 
educational  results,  we  can  report  the  skills  are  perfectly  capable  of  being  modeled  within  a 
production  system  framework  (as  indeed  earlier  research  would  have  indicated — e.g., 
Klahr  &  Carver,  1988). 

“I  find  the  production-rule-based  approach  to  subject  matter...well...less  than  fun. 

It  would  be  hard  for  me  to  believe  that  a  student  would  choose  to  study  geometry 
on  his  or  her  own — would  go  home  and  start  playing  with  geometric  shapes  or 
cutouts  or  models  or  whatever — based  on  experience  with  these  tutors.” 

We  think  it  is  easy  to  underestimate  the  motivational  gains  produced  by  the  simple 
experience  of  learning  achievement.  The  principal  reason  for  the  enthusiasm  for  our  tutors 
within  the  Pittsburgh  Public  School  System  is  motivational  gains  not  achievement  gains. 
Perhaps  our  favorite  anecdote  is  about  one  student  in  a  school  in  another  state  that  had  the 
LISP  tutor.  The  student,  frustrated  by  restrictive  access  to  the  LISP  tutor,  deliberately 
induced  a  two-day  suspension  by  swearing  at  a  teacher.  He  used  those  two  days  to  dial 
into  the  school  computer  from  his  home  and  complete  the  lesson  material  on  the  LISP  tutor. 

While  the  issue  of  the  content  of  the  curriculum  is  essential,  learning  achievement  is  a  very 
empowering  experience.  Thinking  back  on  one’s  own  learning  experiences  and  the 
environment  one  learned  in,  it  is  easy  to  take  learning  for  granted  and  only  focus  on  what  is 
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being  learned.  The  fact  that  something  will  be  learned  cannot  be  taken  for  granted  in  many 
American  schools. 
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Figure  Captions 


Figure  1:  The  appearance  of  the  LISP  tutor  screen  at  the  beginning  of  a  coding 

problem. 

Figure  2:  A  screen  image  from  the  geometry  tutor  showing  the  proof  graph 

formalism.  The  givens  of  the  problem  are  at  the  bottom  of  the  screen  and  the  statement  to 
be  proven  at  the  top. 

Figure  3:  The  box  notation  for  representing  the  hierarchical  algorithmic  decomposition 

in  the  algebra  tutor. 

Figure  4:  Time  to  write  a  piece  of  code  in  the  LISP  tutor  for  productions  new  to  a 

lesson  and  old  productions.  Similar  functions  are  obtained  for  error  rates. 

Figure  5:  Mean  exercise  completion  time  for  five  tutor  lessons. 
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TABLE  1 


The  Problem  Description 


Define  a  function  called  "create-list"  that  accepts  one  argument,  which  must  be  a 
positive  integer.  This  function  returns  a  list  of  all  the  integers  between  1  and  the 
value  of  the  argument,  in  ascending  order.  For  example, 

(create-list  8)  returns  (12  3  4  5  67  8). 

You  should  count  down  in  this  function,  so  that  you  can  just  insert  each  new 
number  into  the  front  of  the  result  variable. 

Cycles  in  which  the  tutor  communicates  with  the  student 

1.  (defun  create-list  (num) 

(loop) 

You  will  need  local  variables  in  this  function.  So  before  you  code  anything  else, 
there  is  a  particular  LISP  operator  you  need  to  code. 


2.  (defun  create-list  (num) 

(let  (( ) 

<BODY>)) 

Which  of  the  following  do  you  want  to  do? 

PRESS:  IF  YOU  WANT  TO 

1  Initialize  a  variable  to  hold  the  list  of  digits  between 

1  and  the  value  of  the  parameter. 

2  Initialize  a  counter  variable  to  keep  track  of  how 

many  times  we  go  around  the  loop. 

3  Have  ^e  tutor  choose. 


Menu  choice:  2 

3.  (defun  create-list  (num) 

(let  ((count  1) 

<OTHER-INrnALIZATIONS>) 

<BODY>)) 

continued 
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In  this  problem  we  won't  want  to  start  counting  at  the  same  value  each  time  we 
call  the  function,  so  we  can’t  initialize  the  counter  to  a  constant  like  1.  How  is  the 
starting  value  being  communicated  to  the  function? 


4.  (defun  create-list  (num) 

(let  ((count  num) 

(numlist  (list  num))) 
(lop)) 

Did  you  mean  loop  (y  or  n)?  y 


5.  (defun  create-list  (num) 

(let  ((count  num) 

(numlist  (list  num))) 

(loop 

(cond  ((equal  count  1)  numlist)) 

<LOOP-ACnONS2>))) 

You  do  want  to  return  the  variable  "numlist"  here,  but  just  typing  it  into  the  code 
won’t  accomplish  that.  What  LISP  operator  will  allow  you  to  return  something 
from  a  LOOP? 


6.  (defun  create-list  (num) 

(let  ((count  num) 

(numlist  (list  num))) 

(loop 

(cond  ((equal  count  1)  (return  numlist))) 

<LOOP-ACTIONS2>))) 

In  numeric  iteration  the  first  thing  you  should  do  after  the  exit  test  is  update  the 
counter  variable.  You  can  use  the  function  SETQ  to  do  this. 


7.  (defun  create-list  (num) 

(let  ((count) 

(numlist  (list  num))) 

(loop 

(cond  ((equal  count  1)  (return  numlist))) 
(setq  count  (1-  count)) 

(cons))) 


continued 
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You’ve  got  the  right  idea,  since  you'll  need  to  use  the  function  "cons",  but 
remember  you  are  trying  to  insert  the  new  value  of  the  counter  into  the  list.  What 
LISP  function  allows  you  to  assign  the  results  of  this  function  call  to  a  local 
variable? 


8.  (defun  create-list  (num) 

(let  ((count  num) 

(numlist  (list  num))) 

(loop 

(cond  ((equal  count  1)  (return  numlist) 

(setq  count  (1-  count) 

(setq  numlist  (list))) 

If  you  LIST  together  arguments  like  7  and  (8)  you  get  (7  (8)).  what  you  want  is 
(7  8).  Try  another  function  that  combines  things  together. 


9.  (defun  create-list  (num) 

(let  ((count  num) 

(numlist  (list  num))) 

(loop 

(cond  ((equal  count  1)  (return  numlist))) 

(setq  count  (1-  count) 

(setq  numlist  (cons  count  numlist))))) 

- YOU  ARE  DONE.  TYPE  NEXT  TO  GO  ON  AFTER— 

—TESTING  THE  FUNCTION  YOU  HAVE  DEFINED— 

THE  LISP  WINDOW 


=>  (create-list  10) 
(12  3  4567  89  10) 
=>  next 


Table  1.  A  Teletype  Depiction  of  a  Hypothetical  Student  Completing  the  Code  for 
the  Function  create-list  with  the  Tutor. 
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FIGURE  1 


0«fin»  •  function  eollod  *crMto>lttt*  t>it  Moopts  ono 
•rgumonl.  whicti  mutt  bo  a  potKtvo  inlopor.  This  function 
rotumt  a  list  of  aN  tho  intogort  botwoon  1  and  tho  valu#  of 
Iho  a/gumont,  In  atoonding  ordor.  For  oxamplo. 

(croato-Mot  8)  lotuma  (1  2  3  4  S  8  7  8). 

You  should  count  down  in  this  function,  so  that  you  can  just 
inaart  aach  now  numbor  Mo  tho  frortt  of  tho  roouN  variablo. 


CODE  for  crooto-litt 


(dofun  croato-litt  <parafnotort> 
<proooot>) 


FIGURES 


